Market Liquidity, Hedging, and Crashes


By GERARD GENNOTTE AND HAYNE LELAND* 


In the absence of significant news, hedging strategies were blamed for the stock 
market crash of October 1987; but traditional models cannot explain how a 
relatively small amount of selling could cause so large a price drop. We develop a 
rational expectations model in which prices play an important role in shaping 
expectations; markets are much less liquid in our model than in traditional 
models. Discontinuities (or “crashes”) can occur even with relatively little 
hedging. The model is consistent with theories as disparate as Keynes’ “beauty
contest” insight and Thom’s “catastrophe” analysis and suggests means to reduce
volatility. (JEL 313)


Immediately following the stock market 
crash of October 19, 1987, both practition- 
ers and academics sought an explanation 
based on external events. While several 
trends were clearly “bad news” for the mar- 
ket, these trends had been revealing them- 
selves over the previous months. It was 
difficult to isolate new events occurring 
between October 16 and October 19 that 
were of sufficient importance to explain the 
magnitude of the price fall. 

The Brady Commission’s examination of 
the October break (Nicholas Brady et al., 
1988) therefore centered on internal market 
causes rather than external events. In par- 
ticular, the Commission focused attention 
on a number of large institutions following 
“price insensitive” strategies such as portfolio
insurance.¹

*Walter A. Haas School of Business, University of
California, Berkeley, CA 94720. The authors thank
Fischer Black, Milt Harris, Roy Henriksson, Guy
Laroque, Ailsa Röell, Mark Rubinstein, Toshi Shibano,
and especially Pete Kyle for extended discussions across
many subjects. Support from the Berkeley Program
in Finance and from Deutsche Forschungsgemeinschaft,
Gottfried-Wilhelm-Leibniz-Förderpreis, during
BoWo89 is gratefully acknowledged.

¹ Portfolio insurance strategies are dynamic hedging
strategies which provide protection by replicating a put
option (see Mark Rubinstein and Leland, 1981). These
strategies have the property that they tend to sell after
the market has declined and to buy after market rises.

In dramatic language, the Brady Report
painted a picture of enormous waves of
institutional selling driving down prices
excessively. The report claimed that such
sellers suffered from an “illusion of liquidity”;
and it buttressed this conclusion by pointing 
out that a few large sellers alone sold about 
$6 billion of stock and stock index futures. 

Although there has been a unanimous 
positive response to the Brady Commission’s 
marshalling of facts, there has not been a 
unanimous acceptance of their interpreta- 
tions of these facts. Formal portfolio- 
insurance strategies were followed by less 
than 3 percent of stock market funds.² While
the $6 billion sold by portfolio insurers 
seems a large amount, it represented only 
15 percent of total stock and stock index 
futures volume on October 19. In absolute 
terms, the $6 billion amounts to less than 
0.2 percent of the roughly $3.5 trillion of 
equity value at the beginning of the day. Is 
it reasonable to think that selling 0.2 per- 
cent can drive down prices by over 20 per- 
cent—that selling $6 billion can cause losses 
of $700 billion? Of course the answer de- 
pends upon the elasticity of demand for 
stocks. But traditional models imply an elas- 
ticity much greater than the market exhib- 
ited on October 19, 1987. 
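The arithmetic behind this question can be made explicit. The short sketch below uses only the figures quoted above ($6 billion sold, roughly $3.5 trillion of equity, a price fall of about 20 percent). The function name is ours, and treating the day as a single point on a demand curve is a simplifying assumption made for illustration.

```python
def implied_elasticity(amount_sold, market_value, price_drop):
    """Fractional change in quantity divided by fractional change in price."""
    quantity_change = amount_sold / market_value
    return quantity_change / price_drop


sold = 6e9        # dollars sold by a few large sellers (figure from the text)
equity = 3.5e12   # approximate equity value at the start of the day
drop = 0.20       # approximate fractional price fall

share_sold = sold / equity
elasticity = implied_elasticity(sold, equity, drop)

print(f"share of equity sold: {share_sold:.2%}")
print(f"implied elasticity:   {elasticity:.4f}")
```

On these numbers the implied elasticity is below 0.01, which is the sense in which the selling looks too small to account for the price fall.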


² Best estimates suggested $70-$100 billion in funds
were following formal portfolio insurance programs. 
On a precrash total stocks value of about $3.5 trillion, 
this represents 2-3 percent. Of course, informal hedg- 
ing strategies such as stop-loss selling may have 
amounted to considerably more than this (see the 
survey of Robert Shiller [1987]).

A recent study by Michael Brennan and
Eduardo Schwartz (1989) suggests that a 
5-percent use of portfolio insurance by in- 
vestors would have a minimal impact on 
market prices and volatility. Their model
and other commonly studied portfolio/consumption
models suggest elasticities of demand
for stock far greater than 1, more
than 100 times the elasticity implied by the
Brady Commission’s conclusions. Information
changes, rather than selling by portfolio
insurers, are needed to explain October 19 
in these standard models. 

Other evidence does not seem to confirm 
a strong connection between the crash and 
portfolio insurance. If short-run selling were 
the cause of the decline, we might expect a 
quick reversal, but this did not occur. Fur- 
thermore, Richard Roll’s (1989) cross- 
market studies showed little correlation be- 
tween October 1987 performance and 
various aspects of markets, including 
whether portfolio insurance was used. 

In sum, the crash of 1987 presents the 
following dilemma to current financial mod- 
els: the amount of selling seems insufficient 
to explain the large price drop observed on 
October 19. Thus, information changes seem 
necessary to explain the drop, but no such 
information changes can be documented. 

Parallels with the crash of 1929 may be 
useful in understanding the crash of 1987. 
As in 1987, no significant economic news was
associated with the period immediately sur- 
rounding the earlier crash. Several large 
declines preceded the crash of 1929—as 
they did in 1987. Volatility increased 
markedly in the weeks preceding both drops. 
In both cases, hedging strategies were dis- 
cussed as a possible contributing factor: 
stop-loss orders in 1929 and portfolio insurance
in 1987. In 1929, stop-loss strategies
were used for portfolio protection but were
also triggered by margin calls, which led to
greater controls over margined stock-buying
following the crash.

Because of these similarities, we would 
hope that an explanation of the 1987 crash 
would also be relevant to the 1929 crash. 
The explanation cannot be entirely in terms 
of futures markets and portfolio insurance,
since neither existed in 1929.

In this paper, we develop an explanation
of market “crashes” that reconciles the 
strands of evidence above. This explanation 
is not based on important changes in infor- 
mation, and therefore it is consistent with 
the failure to observe any significant events 
that directly “caused” the 1987 (or 1929) 
decline. In fact, we define a “crash” to be a 
discontinuity in the relationship between the 
underlying environment and stock prices: an 
infinitesimal shift in information (or other 
small shock) can lead to a major change in 
stock market level.³

Our explanation of crashes is based on
unobserved plans of investors to hedge 
against losses. In 1929, stop-loss strategies 
were used. In 1987, portfolio insurance and 
stop-loss strategies were followed. We de- 
velop a “price pressure” argument akin to 
that of Sanford Grossman (1988a). How- 
ever, this argument must meet two criti- 
cisms: 


(1) How can relatively small amounts of 
hedging drive down prices significantly? 
(2) Why didn’t stock prices rebound the 
moment such selling pressure stopped? 


Our model answers the first question by 
examining the determinants of market li- 
quidity. An important aspect of financial 
markets is that only a small proportion of 
investors actively gather information on fu- 
ture economic prospects or asset supply. 
Other investors look to current prices to 
impute information about future prices. This 
dual role of prices—affecting demand both 
through the budget constraint and through 
expectations—leads to very different price 
elasticities than traditional models, in which 
prices play only the first role. Only recently 
have financial economists begun to explore 


³ Such discontinuities are commonly observed in
physical systems and have been the recent subject of 
study by mathematicians examining “catastrophe the- 
ory.” In a preliminary paper (Gennotte and Leland, 
1987), we considered a simple model of stock market 
discontinuities.

the implications for markets in which prices
play both roles.⁴

In such environments, there is an impor- 
tant difference between observed and unob- 
served supply changes. If there are rela- 
tively few informed investors, markets may 
be much less liquid (and therefore more 
fragile) than traditional models predict when 
unobserved supply shocks occur.⁵ We show
that relatively small unobserved supply
shocks can have pronounced effects—more 
than 100 times greater than the effects of 
observed supply shocks—on current market 
prices. 

Unobserved supply shocks have greater 
price impact as a consequence of investors
inferring information from prices. A supply 
shock leads to lower prices, which in turn 
(since the shock is unobserved) leads unin- 
formed investors to revise downwards their 
expectations. This limits these investors’ 
willingness to absorb the extra supply and 
causes a magnified price response.⁶

Our model answers the second question 
by showing how a discontinuity in market 
prices can occur if hedging plans generate 
very large trades. Hedging plans create ad- 
ditional supply as price falls. A small infor- 
mation change can trigger lower prices, 
which, because of hedging, lead to greater 
excess supply and a further fall in prices. 
Thus, a small change in information can 
lead to a dramatic fall in prices, with no 
immediate rebound occurring. This feature 
of crashes distinguishes our results from 


⁴ See Grossman (1976), Grossman and Joseph Stiglitz
(1980), Martin Hellwig (1980), Douglas Diamond and 
Robert Verrecchia (1981), Albert Kyle (1985), and Anat 
Admati (1985). These papers focus on the role of 
prices in aggregating information. 

⁵ This point is discussed informally in Leland and
Rubinstein (1988) and in D. Cutler, James Poterba, 
and Lawrence Summers (1989). Fischer Black (1988) 
describes a model in which shocks to expectations— 
rather than supply—can cause large price changes. 

⁶ Such models reflect a rational expectations view of
Keynes’ famous “beauty contest” metaphor, that successful
investors must base their investments on their
expectations of others’ expectations of value, rather 
than solely on their own estimates of value. Thus price, 
reflecting others’ expectations, rationally conditions 
each individual investor’s expectations, and bandwagon 
or “herd” effects can result. 


those of Grossman (1988a,b). We demon- 
strate that such a “meltdown” scenario ob- 
tains only for an unrealistically large hedg- 
ing activity when the hedging trades are 
perfectly anticipated. If, however, investors 
(or a fraction of investors) are unaware of 
hedging plans, crashes can occur for much 
smaller levels of hedging activity. The dis- 
continuities arise because investors are un- 
able to perfectly distinguish hedging activity 
from information-based trades and there- 
fore adjust downward their expectations of 
future prices. Imperfect anticipation of 
hedging activity relaxes the rational expec- 
tation requirement, but in a realistic fash- 
ion. Moreover, our estimate of hedging plans 
in place in 1987 approximates the threshold 
at which discontinuities occur in the imper- 
fect anticipation case, providing a potential 
explanation for the 1987 crash. 

Finally, our model suggests that some 
changes in market organization can radi- 
cally reduce the likelihood of crashes. The 
most important such change is increasing 
market knowledge about the size and trad- 
ing requirements of hedging programs. 
Preannouncement of trading requirements 
can lessen the impact of such trades by a 
factor greater than 100. To the extent that 
the specialist’s book helps reveal the nature 
of order flow, this information should be 
made available to all traders. As suggested 
by Grossman, the use of put options to 
implement hedging may also serve to smooth 
markets. 

The model builds from the work of 
Sanford Grossman and Joseph Stiglitz 
(1980), Martin Hellwig (1980), Douglas Dia-
mond and Robert Verrecchia (1981), Anat 
Admati (1985), Gennotte (1985), and Albert 
Kyle (1985). We postulate a subset of in- 
formed investors who receive noisy signals 
about future market values. Random supply 
keeps these investors from perfectly infer- 
ring the aggregate information from price. 
Some investors, however, whom we charac- 
terize as market-makers, receive informa- 
tion about the size of the random supply. 
This information enables market-makers to 
distinguish, at least partially, selling that is 
information-based from selling that is motivated
by liquidity considerations. We show
that market-makers provide a significant
source of liquidity in meeting random sup- 
ply shocks. 

We also allow for the presence of hedging 
programs such as stop-loss orders and port- 
folio insurance. These hedging strategies are 
usually nonlinear functions of the equilib- 
rium price; hence, the resulting rational ex- 
pectations equilibrium price is, in general, a 
nonlinear function of the signals. We exam- 
ine the effects of these strategies on market 
equilibrium and stability for alternative 
specifications about the observability of 
hedging. The possibility of price discontinu- 
ities, with the implication of crashes, is new 
to our model. 


I. Informed and Uninformed Investors


We assume that investors may be in- 
formed or uninformed. The informed in- 
vestors can be subdivided into two types, 
who differ in terms of the signals they are 
able to observe. Thus, in all there are three 
classes of investors: 


(1) uninformed investors (denoted U), who
observe only the equilibrium price $p_0$;

(2) price-informed investors (I), who observe
a personal, unbiased signal $p'_i$ on
future price (or liquidation value) $p$ and
also observe $p_0$;

(3) supply-informed investors (SI), who observe
a common supply signal $S$ and the
equilibrium price $p_0$.⁷


The price-informed investors can be thought 
of as having (personal) information about 
economic fundamentals which are noisy 
predictors of future price. Supply-informed 
investors can be thought of as market- 
makers who have information about the 
sources of order flows: the size of new issues,
portfolio restructurings, and other elements
of liquidity trading.⁸


⁷ Investors who are both price-informed and supply-
informed can easily be incorporated in this framework. 
Adding an I/SI investor has the same impact as adding 
an I and an SI investor separately. 

⁸ Our “market-makers” behave competitively, taking
prices as given. In contrast with Lawrence Glosten and
Paul Milgrom (1985), we do not require that all trades
be completed through the market-makers, or that
market-makers are risk-neutral. Our market-makers absorb
the aggregate excess supply generated by other traders
at the equilibrium price. We discuss their contribution
to market liquidity in Section IV.

Our model allows arbitrary proportions of
investors in each class. The relative propor-
tion of investors who are informed versus
uninformed is a key determinant of market
liquidity. Informed investors, particularly
supply-informed investors, will absorb a
substantial proportion of liquidity-trading
demands. Even when they are relatively few,
informed investors constitute an important
fraction of the supply of liquidity.

Thus, an important empirical question is
the relative number of investors of each
type. While data on this question are dif-
ficult to gather, we do have some evidence
that informed traders, particularly supply-
informed traders, are relatively small as a
fraction of total market capital.

Among supply-informed investors are
specialists and other market-makers (includ-
ing “upstairs” desks) who adjust their posi-
tions in response to changing demand for
liquidity. Because of their role as market-
makers, these investors have special infor-
mation on the nature of demand. Through
the order book or simply on the basis of
their knowledge of institutional trading,
market-makers can learn (perhaps imper-
fectly) about the volume of noninformation
(or “liquidity”) trading versus trading based
on information.

The funds committed to supply-informa-
tion gathering (and providing liquidity) de-
pend upon the return to this activity. In
some cases, competitive forces will deter-
mine the amount provided. In other cases,
institutional factors such as the specialist
system may limit the number of potential
entrants. We shall see below that such limi-
tations can importantly affect the stability of
markets.

There is no way to provide an exact quan-
tification of market-making capital. How-
ever, it clearly is small relative to the $3.5
trillion of equity investment. For example,
the total capital of New York Stock Exchange
specialists, including lines of credit,
is approximately $3 billion (Brady et al., 
1988 p. VI-40). Total capital committed by 
“upstairs” trading houses and other forms 
of market-making may be four or five times 
this number;⁹ but at $15-$20 billion, these
supply-informed funds would represent 
about 0.5 percent of total market capital. 

Capital devoted to “price-informed” mar- 
ket timing is even more difficult to estimate. 
It would include funds that explicitly gather 
information about future economic pros- 
pects (“fundamentals”) and engage in mar- 
ket timing strategies reflecting this informa- 
tion. While many funds actively alter their 
exposures to individual stocks, most do not 
actively alter their total stock exposure based 
on information about future economic 
trends, perhaps because long-term success 
stories have been so rare (see Roy Henriks- 
son, 1984); but a few do. The single most 
prominent market timing strategy is “tacti- 
cal asset allocation,” utilized by perhaps $20 
billion of assets. A total of $70 billion, or 
about 2 percent, might be a guess for price- 
informed funds which actively gather information
about future prices and trade on it.¹⁰

This leaves most investors in the class we
term “uninformed.” “Passive” might be a 
somewhat less pejorative description of 
these investors, who participate in the mar- 
ket “for the long haul” and do not move in 
and out based on information about funda- 
mentals or current liquidity trading. The 
relative lack of popularity of information- 
based market timing strategies suggests that 
most investors belong to this class. 


II. Market Equilibrium


A single risky security is traded. Its future
price (or liquidation value) $p$ is a normally
distributed random variable with unconditional
variance $\Sigma$ and unconditional expectation
$\bar{p}$. All investors share this prior
distribution of future price. Current price is
determined by supply and demand. Riskless
bonds are also traded, and the riskless rate
is zero.¹¹


⁹ Conversations with officers at major investment
banking firms.

¹⁰ We assume that the informed investors have information
of value. It is difficult to assess the fraction
of the market timing based on “quality” fundamental
information, but it is most probably smaller than the
fraction engaged in timing in general. Investors trading
on spurious information would add to the amount of
random “liquidity” trading.


A. Supply 


The net supply of stocks is a fixed amount
$\bar{m}$, modified by two additional factors:


(1) A random and exogenously determined
net supply created by “liquidity traders.”
This shock is composed of two pieces:
an unobserved liquidity shock, $L$, distributed
$N(0, \Sigma_L)$; and a liquidity shock,
$S$, distributed $N(0, \Sigma_S)$, which is observed
by all supply-informed investors.
$S$ and $L$ are assumed to be independently
distributed.¹²

(2) A deterministic supply by hedgers, rebalancers,
and others who utilize dynamic
strategies akin to portfolio insurance.
The supply of stock from these
strategies is a known function of the
current market price $p_0$. We denote this
supply by $\pi(p_0)$, a decreasing differentiable
function of $p_0$. This hedging demand
will be observed by different market
participants in the alternative environments
that are considered below.


Thus, the total supply is


$\bar{m} + L + S + \pi(p_0)$


or

(1)    $m + \pi(p_0)$


where $m = \bar{m} + L + S$.


¹¹ All the results extend in a straightforward way to
the case of nonzero interest rates.

¹² An equivalent formulation would allow the
supply-informed investors to receive a noisy signal
about total liquidity trading and not distinguish $S$ from
$L$. A simple transformation of variables allows the
alternative interpretation.


B. Demand


As discussed above, there are three classes
$j$ of investors characterized by the information
signals (if any) they receive. All investors
maximize expected utility of terminal
wealth over a single period. Preferences
are assumed to exhibit constant absolute
risk aversion. The utility function of each
investor in class $j$ is a function of terminal
wealth $W$ and is given by


$U_j(W) = -\exp(-W / a_j)$.


Expectations depend upon the signals that
investors observe. Each price-informed investor
$i$ observes $p'_i = p + \varepsilon_i$, where $p$ is the
true future price and $\varepsilon_i$ denotes a noise
term uncorrelated with other random variables
and uncorrelated across price-informed
investors. Both $p$ and $\varepsilon_i$ are assumed
to be normally distributed, as
$N(\bar{p}, \Sigma)$ and $N(0, \Sigma_\varepsilon)$, respectively.¹³
Supply-informed (SI) investors receive a common
signal, $S$, where $S$ is the liquidity supply,
distributed $N(0, \Sigma_S)$ and observed only
by SI investors. All investors can observe
the current market price and use this (and
their other information) to determine their
conditional distributions of future price.

¹³ For simplicity and notational convenience, we assume
that the $\varepsilon_i$ are i.i.d. across agents. Differences in
information precision would not affect our results.

All investors in class $j$ have the same
conditional variance $Z_j$ for future price. The
expectation of the future price conditional
on the information available to an agent $i$
belonging to class $j$ is denoted $\bar{p}^i_j$. It is well
known that, given exponential utility and
normality, the portfolio optimization problem
leads to a demand for shares by investor
$i$ in class $j$ equal to


$n^i_j = a_j Z_j^{-1} (\bar{p}^i_j - p_0)$.


There are $w_j$ investors of type $j$. Demand
per investor in class $j$, $n_j = \sum_i n^i_j / w_j$, is
equal to


$n_j = a_j Z_j^{-1} (\bar{p}_j - p_0)$


where $\bar{p}_j$ is the mean expected future price
for investors in class $j$. All supply-informed
investors observe the same signals and hence
have the same conditional expected price.
This also is the case for uninformed investors.
Price-informed investors receive
different signals, $p'_i$. However, as their number
increases, the mean expected future
price converges to the actual future price $p$
by the law of large numbers.¹⁴
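For readers who want the intermediate step, the demand for shares follows from a standard mean-variance argument. The sketch below is the textbook derivation under the assumptions already stated (exponential utility, normally distributed future price, zero riskless rate). The symbols $W_0$ for initial wealth and $n$ for the number of shares held are introduced here for illustration only.

```latex
\begin{align*}
W &= W_0 + n\,(p - p_0), \\
E\!\left[-\exp(-W/a_j)\right]
  &= -\exp\!\left(-\frac{1}{a_j}
     \Big[\,W_0 + n\,(\bar{p}^i_j - p_0) - \frac{n^2 Z_j}{2 a_j}\Big]\right), \\
\frac{\partial}{\partial n}
  \Big[\,n\,(\bar{p}^i_j - p_0) - \frac{n^2 Z_j}{2 a_j}\Big] = 0
  \;&\Longrightarrow\;
  n = a_j Z_j^{-1}\,(\bar{p}^i_j - p_0).
\end{align*}
```

The second line uses the moment-generating function of a normal variable: conditional on the investor's information, $W$ is normal with mean $W_0 + n(\bar{p}^i_j - p_0)$ and variance $n^2 Z_j$.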

The relative market power of investor
class $j$, $k_j$, is defined as the ratio of the
weighted risk tolerance of the class to the
sum of the weighted risk tolerances:


$k_j = a_j w_j / \sum_l a_l w_l$.


Define the normalized total demand $D$ as
the sum of the individual classes’ demands
divided by the weighted sum of the three
classes’ risk tolerance, $\sum_j w_j a_j$:


(2)    $D = \frac{\sum_j w_j n_j}{\sum_j w_j a_j} = \sum_j k_j Z_j^{-1} (\bar{p}_j - p_0)$.


Similarly, the supply parameters $\pi$, $\bar{m}$, $S$,
and $L$ are normalized by dividing the original
parameters by $\sum_j w_j a_j$; we do, however,
keep the same notation. Our analysis focuses
on relative proportions and is thus
unaffected by this normalization.


C. Equilibrium 


Equilibrium of supply (1) and demand (2)
yields the equation for equilibrium price:


$Z^{-1} p_0 + \pi(p_0) = \sum_j k_j Z_j^{-1} \bar{p}_j - m$


where $Z^{-1}$ is $\sum_j k_j Z_j^{-1}$. We may now characterize
the equilibrium price function relating
current price to future price (as revealed
by the average of individual signals
$p'_i$), unobserved liquidity supply ($L$), and
observed supply ($S$).


¹⁴ Hellwig (1980) shows the error terms $\varepsilon_i$ do cancel
in the limit of a sequence of finite economies where the
relative proportion of investors in each class remains
fixed and where the total number of investors and the
supply parameters grow without bound at the same
rate. Individual agents are thus price-takers and, importantly,
the individual error terms $\varepsilon_i$ do not affect
prices.

In our postulated environment, with all
investors cognizant of hedging supply $\pi(p_0)$,
we can show the following.


THEOREM 1: There exists a rational expectations
equilibrium (REE) of the form


(3)    $p_0 = f(p - \bar{p} - HL - IS)$


where f(-) is a correspondence and where H 
and I are real constants that depend only on 
the agents’ relative market power and on the 
means and variances of the random variables. 


The proof and all derivations, as well as
expressions for $f(\cdot)$, $H$, and $I$, are provided
in the Appendix.

The price correspondence $f(\cdot)$ in Theorem 1
can be discontinuous and multivalued.
We can, however, characterize precisely
the situations in which crashes can be
ruled out as follows.


PROPOSITION 1: $f(\cdot)$ is a continuous
function if and only if $Z^{-1} p_0 + \pi(p_0)$ is
strictly monotonic in $p_0$.


PROPOSITION 2: In the absence of hedgers
[$\pi(p_0) = 0$], $f(\cdot)$ is a continuous function,
and no “crashes” can occur.
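The condition in Proposition 1 can be checked numerically for any candidate hedging rule. The sketch below is illustrative only: the logistic form of `hedging_supply`, its `strike` and `width` parameters, the value Z = 0.04 (a 20-percent conditional standard deviation), and the two hedging sizes are all assumptions made here, not quantities taken from the model's calibration.

```python
import math


def hedging_supply(p0, size, strike=1.0, width=0.05):
    """Hypothetical hedging supply pi(p0): smooth and decreasing in price."""
    return size / (1.0 + math.exp((p0 - strike) / width))


def lhs(p0, z, size):
    """Left-hand side of the equilibrium condition: p0 / Z + pi(p0)."""
    return p0 / z + hedging_supply(p0, size)


def is_strictly_monotonic(z, size, lo=0.5, hi=1.5, steps=2000):
    """Check strict monotonicity of lhs on an evenly spaced price grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    values = [lhs(p, z, size) for p in grid]
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d > 0 for d in diffs) or all(d < 0 for d in diffs)


Z = 0.04  # assumed conditional variance (20 percent standard deviation)
for size in (1.0, 10.0):  # assumed small and large hedging programs
    print(size, is_strictly_monotonic(Z, size))
```

With this assumed functional form, the steepest slope of the hedging supply is size / (4 × width) in absolute value, so the left-hand side stops being monotonic once that slope exceeds 1/Z. That is the case in which the price correspondence can become discontinuous.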


In the absence of hedgers or if the hedging 
supply is a linear function of equilibrium 
prices, f is a linear function [equation (A4) 
in the Appendix]. The REE function f is 
nonlinear if the hedging supply is a nonlinear
function of equilibrium prices $p_0$. Even
though current price $p_0$ is not normally
distributed due to the nonlinearity of $f(\cdot)$,
investors recognize that a simple transformation
of $p_0$, namely $f^{-1}(p_0)$, is normally
distributed and can be used to condition
expectations of future price $p$. In the following
two sections, we examine aspects of
equilibrium price behavior in the absence of 
hedgers. 


III. The Nature of Equilibrium Pricing:
An Example 


Consistent with our earlier discussion, we 
consider an example in which there are 
relatively few price-informed and supply- 
informed investors. We assume that 0.5 
percent of investors (market-makers) are 
supply-informed and that 2 percent of in- 
vestors are price-informed. There is no 
hedging supply $\pi$; this will be introduced in
Section IV. 

Several other parameters must be speci- 
fied before the model is complete. A key 
parameter is the quality of the information
signal received by the price-informed investors.
The better the signal, as expressed
by the signal-to-noise ratio, the lower the
conditional variance $Z_I$ for the price-
informed investors.

We assume that the quality of the signal 
received by each price-informed investor is 
not very high. Specifically, we assume that 
the price-informed investors’ signal-to-noise 
ratio is 0.2. Thus, if $\Sigma$ is the ex ante variance
of future price and $\Sigma_\varepsilon$ is the variance
of each price-informed investor’s future
price signal, then $\Sigma_\varepsilon$ is five times $\Sigma$. This
assumption implies that (in equilibrium) the
price-informed investors’ conditional standard
deviation for future price is 19.1 percent,
rather than the 20 percent of uninformed
investors, who observe price only.
This slight improvement seems consistent
with the perceived difficulty in predicting
future market prices.¹⁵

¹⁵ While each individual signal about future price is
quite noisy, the average signal perfectly reflects future
price, as in Hellwig (1980). But individual investors
cannot “back out” the true future price from current
price, because supply is noisy.

Also important is the fraction of total
liquidity-supply shock that, on average, can
be observed by supply-informed investors.
Since $S$ is observed and $L$ is not, this fraction
can be parameterized by the ratio
$\Sigma_S / \Sigma_L$. If the ratio is high, then supply-informed
investors on average will observe
most of the total liquidity shock. We assume
that the ratio is 1: on average, supply-informed
investors receive a signal that reveals
information about half the total liquidity-supply
shock. Conditional on the supply
signal, a supply-informed investor estimates
a 19.2-percent standard deviation for future
price.
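To see how little the two informed classes move the market-wide conditional variance, the definition $Z^{-1} = \sum_j k_j Z_j^{-1}$ can be evaluated with the figures quoted in this example. The sketch assumes equal risk tolerance across investors, so that each class's market power $k_j$ equals its population share; that equality is our simplifying assumption, and the variable names are ours.

```python
import math

# Population shares from the example, used as market power k_j
# under the assumption of equal risk tolerance across investors.
market_power = {"U": 0.975, "I": 0.02, "SI": 0.005}

# Conditional standard deviations of future price quoted in the text.
cond_std = {"U": 0.20, "I": 0.191, "SI": 0.192}

# Z^{-1} is the market-power-weighted sum of the classes' precisions.
z_inverse = sum(market_power[j] / cond_std[j] ** 2 for j in market_power)
z = 1.0 / z_inverse

print(f"Z = {z:.5f}, implied standard deviation = {math.sqrt(z):.4f}")
```

Because uninformed investors carry almost all of the weight, the aggregate figure stays very close to their 20 percent.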

Our example is consistent with a linear 
rational expectations price function as in 
Theorem 1. We choose $\Sigma$, the ex ante variance
of future price, and $\bar{m}$, the fixed supply,
to equate the expected return on the
risky security to 6 percent and the standard
deviation (conditional on current price) to
20 percent.¹⁶ Finally, we find a variance of
supply &,, such that the variance of pp, the 
current price, is equal to the variance of the 
future price p, conditional on po. This pro- 
vides the example with intertemporal con- 
sistency: expected future price volatility 
(given current price) equals current price 
volatility. Parameters for the example are 
summarized in the Appendix. 

The rational expectations equilibrium 
price function (3) for our example is 


(4)    $p_0 = 0.5\,(p - \bar{p} - 19.95\,L - 8.14\,S) + 1$.


Given this price function and the volatilities 
of future price and liquidity surprises, the 
standard deviation of $p_0$ is 20 percent, as is
the standard deviation of $p$ conditional on
$p_0$. The example is chosen to reflect “reasonable”
parameters when the model’s single
risky security is interpreted as the stock
market portfolio and will be used in subse- 
quent sections to illustrate aspects of mar- 
ket behavior. 
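Equation (4) is simple enough to evaluate directly. The sketch below codes it as a function and compares the price impact of an equally sized shock to unobserved supply $L$ and to supply $S$ observed by supply-informed investors. The coefficients are read off equation (4); the shock size of 0.01 (in the normalized supply units of the model) and the function name are assumptions chosen for illustration.

```python
F, H, I = 0.5, 19.95, 8.14  # slope and constants read off equation (4)


def equilibrium_price(p_minus_pbar, L, S):
    """Price function (4): p0 = F * (p - pbar - H*L - I*S) + 1."""
    return F * (p_minus_pbar - H * L - I * S) + 1.0


base = equilibrium_price(0.0, 0.0, 0.0)
shock = 0.01  # hypothetical shock size, normalized supply units

drop_unobserved = base - equilibrium_price(0.0, shock, 0.0)
drop_seen_by_si = base - equilibrium_price(0.0, 0.0, shock)

print(f"price drop, unobserved shock L:       {drop_unobserved:.5f}")
print(f"price drop, shock S seen by SI only:  {drop_seen_by_si:.5f}")
print(f"ratio: {drop_unobserved / drop_seen_by_si:.2f}")
```

The ratio of the two impacts is H / I, so in this example a wholly unobserved shock moves price roughly two and a half times as much as one that supply-informed investors can see.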


¹⁶ Recall that the interest rate is normalized to 0.
Thus, the assumed return of 6 percent represents a
6-percent premium over the riskless interest rate. The
risk and excess return of the risky security in our
example are consistent with the long-term risk and
excess return of the stock market as estimated by
Ibbotson Associates (1985).


IV. Stock Market Liquidity


Because of the Brady Commission’s focus
on limited stock market liquidity, we examine
the impact of changes in supply on
market price. We postulate a small percentage
change in supply and determine the
resulting percentage change in the equilib- 
rium price from the pricing relation (3). 
This will then determine the (inverse of the) 
price elasticity of the market. We interpret 
greater price elasticity as a more liquid mar- 
ket. In this section, we continue to assume 
that there is zero net hedging: $\pi = 0$.

We study three possibilities: supply in- 
creases, and 


(1) the increase in supply is known to all 
investors; 

(2) the supply-informed investors (only) re- 
ceive an accurate signal about the in- 
crease in supply; 

(3) no signal is received by the supply- 
informed investors (or anyone else). 


The first possibility is modeled by letting the
expected supply $\bar{m}$ change. A change in
expected supply will be common knowledge
and will not affect investors’ expected future
price. From equations (3) and (A4) in the
Appendix,


$\text{Elasticity} = -\frac{\delta \bar{m} / \bar{m}}{\delta p_0 / p_0} = \frac{p_0}{\bar{m}} Z^{-1}$.

Given the example (4), we find an elastic- 
ity of 17: a 1-percent observed supply in- 
crease will lead to a 0.06-percent fall in 
price. Such a high elasticity is very much in 
line with the predictions of traditional mod- 
els, which do not postulate that investors 
learn from market prices. Investor classes 
participate proportionately to their market 
power $k_j$ in absorbing the increase in supply.

The second possibility is modeled by a small increase in the random
supply S, which is observed only by the supply-informed investors.
In this case,


$$\text{Elasticity} = -\frac{\delta S/\bar{m}}{\delta p_0/p_0} = \frac{p_0}{\bar{m}}\,\frac{1}{FI},$$


where F is the slope of the price function f, which is linear in our
example.

In the example (4), we find an elasticity of 
0.16: a 1-percent partially observed increase 
in supply will lower price by 6 percent. 
Remarkably, this is only 1 percent of the 
elasticity above. This is because investors, 
with the exception of the supply-informed 
investors, revise downward their expectations
(which are conditional on p₀) as price
falls. Thus, they are less willing to absorb 
the increased supply. Indeed, in our exam- 
ple, the supply-informed investors absorb 
about 54 percent of any increase in liquidity 
supply, which they observe, even though 
they constitute only 0.5 percent of invest- 
ors. 

Price-informed traders, who represent 2 
percent of investors, absorb another 18 per- 
cent of the increase in liquidity supply. They 
are more willing to buy as prices fall be- 
cause (on average) they receive signals about 
future price that moderate the fall of ex- 
pected future price. Uninformed investors 
have no such signals and can only infer from 
current price. Because they impute a lower 
future price as current price falls, they ab- 
sorb but 28 percent of the increased supply, 
despite the fact that they constitute 97.5 
percent of investors. 

The final possibility is modeled by an increase in L. In this case,
elasticity falls even further, since the supply-informed traders
will not observe the increase in supply and will not increase their
demand. From (3), we can determine that


$$\text{Elasticity} = -\frac{\delta L/\bar{m}}{\delta p_0/p_0} = \frac{p_0}{\bar{m}}\,\frac{1}{FH}.$$


The exponential utility model does not limit pur- 
chases by investors to their initial wealth. If we im- 
posed a “no leverage” condition, elasticity in this case 
would be even lower. This is because in our equilib- 
rium, supply-informed investors will buy tremendous 
amounts of stock (on a per capita basis) when prices 
fall. This would only be possible if they can undertake 
levered stock positions. 


In the example (4), elasticity will be 0.07, 
or about 1/250 of the elasticity predicted by 
traditional portfolio/consumption models. 
A 1-percent unobserved increase in supply 
will lower prices by 14 percent. In this case, 
price-informed investors will absorb about 
40 percent of the supply increase, and unin- 
formed investors absorb the remaining 60 
percent.¹⁸ But price must fall precipitously
to induce them to absorb the extra supply. 
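
The three elasticities quoted above can be turned into price impacts with one line of arithmetic. The short sketch below is only an illustration of that arithmetic: it assumes that elasticity is the percentage change in supply divided by the percentage fall in price, as defined at the start of this section, and the function and dictionary names are hypothetical.

```python
# Elasticities quoted in the text for the example (4).
elasticities = {
    "observed by all investors": 17.0,
    "observed by supply-informed investors only": 0.16,
    "unobserved": 0.07,
}


def price_fall_pct(elasticity, supply_increase_pct=1.0):
    """Percentage price fall implied by a given percentage supply increase.

    Assumes elasticity = (% change in supply) / (% fall in price),
    so the price fall is the supply change divided by the elasticity.
    """
    return supply_increase_pct / elasticity


for case, e in elasticities.items():
    fall = price_fall_pct(e)
    print(f"{case}: elasticity {e:g} -> price falls about {fall:.2f} percent")
```

The results (roughly 0.06, 6, and 14 percent) match the price falls stated in the text for a 1-percent supply increase.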

The model therefore resolves the paradox 
of low versus high demand elasticity. If sup- 
ply changes are unobserved, all investors 
will revise downward their expected future 
price and will absorb the increased supplies 
only after price has fallen substantially. 
Price-informed investors will have some- 
what greater elasticity of demand than unin- 
formed investors, since they receive inde- 
pendent information about future prices. 
However, their contribution will be minimal 
if they are few, or if their price information 
is very noisy. 

How supply-informed investors, or mar- 
ket-makers, contribute to market liquidity 
(and therefore to price volatility) depends 
on the quality of the supply signal they 
observe. When their signal has low preci- 
sion, they side with the uninformed in- 
vestors, who always sell when prices rise 
and always buy when prices fall. Their aver- 
age selling price therefore is higher than 
their average buying price, providing a posi- 
tive spread much like the uninformed mar- 
ket-maker in Lawrence Glosten and Paul 
Milgrom (1985). Because their actions are 
not aggressive and their numbers are rela- 
tively small, market-makers with poor sup- 
ply information will reduce volatility only 
marginally. 

When market-makers receive more precise information about the extent
of liquidity trading, their behavior changes. They aggressively take
the other side of observed liquidity trades, thereby reducing the
price volatility associated with liquidity shocks; but
supply-informed investors will trade in the same direction as
informed traders when their information indicates that there is
little liquidity trading. This behavior tends to accentuate price
moves related to information and suggests that a further examination
of the market-makers’ role is warranted.¹⁹ Only the supply-informed
market-makers will have a high elasticity of demand and the
consequent ability to absorb liquidity trades. But these
participants are relatively few in number, and risk aversion
ultimately limits their ability to absorb liquidity selling without
a substantial price drop. In sum, traditional models will grossly
overestimate the liquidity of financial markets, unless all
investors observe the increase in supply.


¹⁸Supply-informed investors play little role in this scenario, since
they observe S = 0. In fact, they actually sell (a small amount)
rather than buy, since observing S = 0 implies that price
information is more likely to be negative, given the fall in prices.


V. Hedging Strategies and Market Stability 


We now consider the situation when there 
is hedging activity in the market. Hedgers 
sell as stock prices fall. They do so to pro- 
tect themselves against further potential 
losses. Whether hedge programs are carried 
out by portfolio insurance programs or by 
less formal means such as stop-loss orders, 
the result is the same: supply increases as 
prices fall. This selling must be absorbed by 
other investors.

It is generally believed that hedge pro- 
grams can make markets more volatile. But 
can they lead to a crash or “meltdown,” in 
which selling begets selling and prices
plunge without stop? Phrased more formally,
can the function relating price to
underlying information become discontinuous?

We examine these questions by adding a hedging supply π(p₀) to a
market that previously did not have such a supply. At the initial
equilibrium price (p₀ = 1), we normalize hedging supply to be zero.
For prices below p₀ = 1, the hedging supply will be positive; for
prices above, negative. We assume that the hedging supply is a
continuous and differentiable function of p₀.


¹⁹Perhaps recognizing the incentives of supply-informed
market-makers to accentuate price movements by their own trading
activities, exchange rules require specialists to maintain “orderly
markets.”


The key to the stability of markets is the extent to which hedging
strategies are observed by investors. Parallel to our discussion of
market liquidity, we consider three cases: all investors observe the
hedging supply function π; only supply-informed investors observe
hedging; and no investors observe hedging. In the first case, agents
have perfect knowledge of the market structure; in the others, some
agents underestimate hedging activity.²⁰

Theorem 1 characterized equilibrium
when all investors can observe the hedging
supply function π(·). We can further show
the following.


PROPOSITION 3: When all investors observe hedging supply π, (i) the
current equilibrium price will be more volatile than when there is
no hedging supply, and (ii) the equilibrium price function can be
discontinuous when hedging supply is sufficiently large.


Although discontinuities and, therefore,
crashes can occur with full observability of
π, we shall show that crashes are unlikely in
this environment.

We now characterize equilibrium when 
hedging is partially observed or unobserved. 


THEOREM 2: If hedging supply π(·) can be observed only by
supply-informed investors or is totally unobserved, there exists a
price equilibrium


$$p_0 = f(p - \bar{p} - HL - IS)$$


where f(·) is a correspondence that depends upon the extent of
observability, but H and I are real constants which are identical to
those when hedging supply is fully observed.


²⁰Through time, investors might learn of the existence of hedgers.
Modeling this would require a multiperiod framework. However, the
amount of hedge trading typically is small if the price level is not
“too” close to the critical points, making it difficult to infer the
extent of hedging. Further, when hedgers first enter the market,
there is no mechanism whereby uninformed traders could immediately
recognize their presence. An extension of our model might allow
investors to have some prior probability distribution of hedging
supply given current price and use observed prices to infer the
likelihood of hedging activity as well as liquidity shocks and
future price information. We hope to explore this more complex
problem in subsequent work.


Partial observability or nonobservability therefore does not affect
the argument of f(·) but does affect its functional form. In the
context of Theorem 2, some investors do not observe hedging supply
and thus make an erroneous assumption on the functional form for
equilibrium prices. This incorrect assumption is reflected in the
functional form of actual equilibrium prices. The agents who are
aware of hedging plans, however, know the actual functional form.
For the cases in which some agents are aware of hedging plans, the
same equilibrium would obtain if the agents were unaware of hedging
plans but were able to identify the hedging trades as liquidity
(i.e., information-free) trades. This property explains why the
difference in the excess-demand equations (5) below amounts to
adding hedging activity π to the expected supply m̄ [in (5i)], to the
observed liquidity trades S [in (5ii)], and to the unobserved
liquidity trades L [in (5iii)].

We further show the following.


PROPOSITION 4: When only supply-informed investors observe hedging
activity π, (i) the current price will be more volatile than when
all investors observe a similar amount of hedging activity, and (ii)
the equilibrium price can become discontinuous at a lower level of
hedging activity than when all investors observe hedging.


PROPOSITION 5: When all investors are ignorant of hedging activity
π, (i) current price is even more volatile than when only
supply-informed investors observe hedging activity, and (ii)
discontinuities can occur at even lower levels of hedging activity.


The maximal hedging level before price discontinuities or “crashes”
can occur thus depends critically on whether hedging is observed. It
also depends upon the nature of hedging strategies. We assume that a
fraction ω of assets are protected by a put-option replicating
strategy.²¹ The supply created by this portfolio-insurance hedging
strategy will depend on the current stock price p₀. The incremental
hedging supply when the current price is p₀, relative to the supply
at the initial equilibrium price (p₀ = 1), is given by


$$\pi(p_0) = \omega\{N[d_1(1)] - N[d_1(p_0)]\}$$


where ω is the fraction of assets subject to the hedging strategy,
N(·) is the cumulative normal distribution function, and d₁ is given
by the Black-Scholes formula


where K is the striking price of the option and σ is the standard
deviation of p conditional on p₀.²² Note that π′(p₀), the derivative
of the hedging supply with respect to p₀, is negative for large p₀
and becomes more negative as p₀ falls, before eventually approaching
zero as p₀ falls to zero.
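
The shape of the hedging supply described above can be checked numerically. The sketch below is illustrative only: because the expression for d₁ does not survive in the text, it assumes the textbook Black-Scholes d₁ with a zero interest rate and a one-year horizon, which may differ from the authors' exact specification. The values ω = 0.05 and K = 0.9 are those of the example; σ = 0.2 and all function names are assumptions made for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1(p0, K, sigma):
    """ASSUMPTION: textbook Black-Scholes d1 with r = 0 and T = 1."""
    return (math.log(p0 / K) + 0.5 * sigma ** 2) / sigma


def hedging_supply(p0, omega=0.05, K=0.9, sigma=0.2):
    """Incremental hedging supply relative to the initial price p0 = 1.

    omega and K follow the example in the text; sigma = 0.2 is an
    assumed value, not one stated in this section.
    """
    return omega * (norm_cdf(d1(1.0, K, sigma)) - norm_cdf(d1(p0, K, sigma)))


for price in (1.1, 1.0, 0.9, 0.8, 0.6):
    print(f"p0 = {price:.2f}: hedging supply = {hedging_supply(price):+.4f}")
```

Under these assumptions the supply is zero at p₀ = 1, positive below it, negative above it, and never exceeds ω in absolute value, in line with the normalization described in the text.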

With the three alternative specifications
above for f(·) depending on observability,
we can derive the excess-demand functions
as p₀ varies. From the Appendix, the equations
for excess demand are given by


1 
(Si) XDa=—|p~-B-HL~IS 


Z~'(D- po) -(m+7) 
= ) i ae 


²¹Put-replicating strategies are just one possible type of hedging.
Others might include stop-loss, “constant proportion of surplus”
policies, or do-it-yourself strategies. We examine put-option
replication because it was the most prevalent of formal protection
strategies on October 19, 1987.

²²See Black and Myron Scholes (1973). Our formula assumes that the
interest rate has been normalized to 0, and assumes a one-year time
horizon. Note that the Black-Scholes hedge replicates a put option
when future price p follows a lognormal process; our model presumes
that p is normally distributed.

when all investors observe π,
+. 1 

(Sii) XDp= 42-5- ML Is 


Z-'(p- po) -m I 
Z lgl g 


when only supply-informed investors observe π, and


p-p—-HL-—IS 





Í 
Sii) XDy=— 
(Siii) U" Fy 


Z~'(P-po)-™ 
Zit | 


when no investors observe π.

These three functions are graphed in Figure 1 for the parameters in
our earlier example, with 5 percent of investors following a
put-replicating hedge strategy with the protected level being 90
percent of initial price (ω = 0.05, K = 0.9). Note that the fully
anticipated excess-demand function is the flattest; neither it nor
the partially anticipated excess-demand function is
“backward-bending.” However, the unanticipated excess-demand
function is backward-bending. The three curves in Figure 1 intersect
at p₀ = 1. Thus, in the absence of future price or liquidity shocks,
the price p₀ = 1 is an equilibrium in all three cases.

Now, suppose that information signals about future price become
slightly more pessimistic: p − p̄ = −0.01, a 1-percent downward
shock. This will cause demand to fall slightly, thereby shifting all
three curves to the left by the same small amount. Figure 2 depicts
this shift.

To restore equilibrium, price will fall in all three cases, until
excess demand again is zero. The excess-demand curve when hedging is
unobserved has the steepest slope: the resulting price drop (2.7
percent) to restore equilibrium will be greater than the price drop
(0.7 percent) to restore equilibrium in the partially observable
case, which in turn will be greater than the price drop (0.5
percent) to restore equilibrium in the fully observable case.

FIGURE 1. AGGREGATE EXCESS DEMAND IN THE
ABSENCE OF FUTURE PRICE OR LIQUIDITY SHOCK.

FIGURE 2. AGGREGATE EXCESS DEMAND WITH
1-PERCENT DOWNWARD SHOCK. SYMBOLS:
O = UNOBSERVED; © = PARTIALLY OBSERVED;
A = ANTICIPATED

In short, the market price is more volatile in response to future
price shocks when hedging supply is unobservable. Volatility will
also be greater in all cases if ω, the proportion of hedgers,
becomes larger. When hedging activity is unobserved, volatility
increases because investors believe a change in fundamentals is more
probable, creating a magnified price response.


A. Prelude to a Crash 


We continue to examine market behavior as information about future
price becomes (continuously) more pessimistic. Figure 3 indicates
the situation for p − p̄ = −0.016. Relative to our initial
equilibrium (at p₀ = 1), the average of future price signals is now
1.6 percent more pessimistic, or 0.6 percent more pessimistic than
the case in Figure 2. Of course, price must drop further to restore
equilibrium. This in turn creates further portfolio-insurance
selling.

How far does price drop? This depends on how completely
portfolio-insurance selling is observed. If every investor observes
π, the price falls from 1 to 0.992, or 0.8 percent; if it is
observed only by the supply-informed, price falls by 1.2 percent;
but if no one can observe π, price will fall 7.25 percent in
response to the signal (−0.016), almost ten times as far as when π
is fully observed.

Note that, in the case of unobserved π, the market also becomes more
sensitive to future price signals. This can be seen in Figures 2 and
3 by noting the fact that excess demand is becoming a steeper
function of p₀. Thus, volatility of the market is increasing as p₀
falls.

The move from the situation in Figure 1 
to the situation in Figure 3 seems to corre- 
spond to the steady erosion of confidence 
that occurred during the month leading up 
to October 19, 1987. As the Brady Commis- 
sion Report documented, a number of neg- 
ative economic trends came to light during 
this period: interest rates were rising; the 
dollar was weakening; tensions in the Mid- 
dle East were increasing; and so on. In our 
model, this is reflected by a sequence of
negative signals about future price. 

As the market fell, portfolio-insurance programs became more active.
At higher market levels, not much hedging was necessary, given the
relatively low levels of protection chosen by many pension funds. As
the market fell closer to the desired protection level, greater
hedging was needed, and the market became more volatile. Yet
although portfolio insurance was beginning to attract some public
attention, it was largely unknown to the majority of investors and
not fully understood even by market professionals. It was Friday,
October 16, 1987.

FIGURE 3. AGGREGATE EXCESS DEMAND WITH
1.6-PERCENT DOWNWARD SHOCK. SYMBOLS:
O = UNOBSERVED; © = PARTIALLY OBSERVED;
A = ANTICIPATED


B. The Crash 


Figure 3 shows the market at a critical 
point when hedging strategies are not ob- 
served (such as on October 16). Prior to 
October 16, 1987, prices had fallen strongly 
over the previous several trading sessions, 
with great volatility. Over the weekend, a bit more negative news
came into the market (nothing earthshaking, but enough to shift the
backward-bending excess-demand curve a fractional amount further to
the left).

FIGURE 4. AGGREGATE EXCESS DEMAND WITH
1.8-PERCENT DOWNWARD SHOCK. SYMBOLS:
O = UNOBSERVED; © = PARTIALLY OBSERVED;
A = ANTICIPATED


Figure 4 illustrates the situation as it may
have been on Monday morning, October 19,
1987. The value of p − p̄ has fallen to
−0.018, slightly below its previous level.
The marginally negative news over the 
weekend, coupled with further portfolio- 
insurance selling (including some resulting 
from Friday’s decline) led to rapidly falling 
prices. 

Observing these falling prices, unin- 
formed investors (rationally) concluded that 
highly negative information must have been 
received by the price-informed investors. 
(Indeed, the following day’s newspapers 
vainly sought the information event which 
“must” have triggered the crash.) As re- 
ported by Robert Shiller (1987), the major- 
ity of investors stood on the sidelines or 
bought only limited amounts, consistent with 
a conviction that something unknown but 
terrible must have happened. Investors sur- 
veyed by Shiller reacted more to the crash 
itself than to outside news. Meanwhile, 
hedgers were selling ever larger amounts. 

As Figure 4 shows, excess supply actually increased as prices began
to fall, leading them to fall even further. The feared meltdown was
actually happening. Only when hedgers had largely completed their
selling did the market stabilize, but at a much lower level. In
Figure 4, our example shows a postcrash equilibrium price of
p₀ = 0.64: a 30-percent drop from its previous closing price in
Figure 3. While the market on October 19 did not fall quite this
far, it also is the case that many hedgers scaled back the size of
their hedging programs in the face of extraordinarily high
transactions costs.²³

A similar story could be told about the 
1929 crash. The only difference is that port- 
folio-insurance hedging would be replaced 
by stop-loss hedging. While less exact in 
delivering desired results, stop-loss orders 
have the effect of increasing liquidity supply 
as prices fall. It is no accident that investi- 
gators focused on the role of stop-loss or- 
ders and margined stock buying, since the 
latter forced additional stop-loss selling as 
the market descended. 

It should be emphasized that a crash in 
our model is not due to a discontinuous 
change in the underlying information. 
Rather, the market reaches a critical point, 
and a “catastrophe” occurs, both in practice 
and in theory. While hedging strategies 
are an important part of our explanation of 
the crash, equally important is the market 
structure which precludes observing these 
hedging strategies. Figures 1-4 also plot the excess-demand
functions associated with partial or complete observability. These
“regular” (i.e., not backward-bending) excess-demand functions
eliminate the possibility of crashes in our example. Indeed, if π
programs had been fully observable, prices would have fallen a
modest 1 percent; and the fall would have been about 1.5 percent if
supply-informed investors (only) had observed the extent of π sales.


²³For a description of how hedging programs were modified in the
presence of high trading costs, see Leland (1988).

²⁴Hal Varian (1979) discusses catastrophe theory and its relation to
economic models. Our discontinuity represents a “cusp catastrophe,”
as discussed in Section VI.

This is not to say that crashes are impossible with partial
observability. If we had assumed a 15-percent use of portfolio
insurance (ω = 0.15), the excess-demand curve with partial
observation would be backward-bending. A 15-percent use would
represent over $500 billion, or more than five times the total
amount estimated for formal programs. Shiller’s survey suggested
that formal portfolio-insurance programs were “the tip of the
iceberg” relative to total hedging, so it is possible that the crash
could have occurred in our example even with supply-informed traders
aware of hedging supply.


C. After the Fall


A new low-price equilibrium is established after the crash. If
information about future prices now reverses itself, returning it to
precrash levels of optimism, will the market rebound to its former
level? The answer is no. In Figure 4, a small rightward shift of the
excess-demand function will lead to a small increase in equilibrium
price p₀ from the 0.64 level. Even if the upper branch of the
excess-demand curve intersects the zero-excess-demand line, implying
the possibility of multiple equilibria, the lower equilibrium price
is locally stable and can be expected to prevail. Since the slope of
the excess-demand curve is less steep at p₀ = 0.64 than just before
the crash (when p₀ = 0.9275), price volatility will return to lower
levels.

Eventually, if information becomes still more favorable (to
p − p̄ = 0.026, well above precrash levels) and if the hedging
function π remains the same as before the crash, the excess-demand
curve will shift sufficiently to the right such that its lower
branch is just tangent to the vertical zero-excess-demand line (see
Fig. 5). This will be accompanied by higher volatility. Any further
increase in future price expectations could lead to an upward jump
in prices: a “meltup” rather than a meltdown. In our example, the
discontinuous jump would commence at p₀ = 0.74 (15 percent above the
market low) and jump to p₀ = 1.043.
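
The sizes of the crash and of the subsequent meltup follow directly from the equilibrium prices quoted in the example. The snippet below is a minimal arithmetic check using only those quoted prices; the helper name `pct_change` is hypothetical.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# Equilibrium prices quoted in the text for the unobserved-hedging case.
pre_crash = 0.9275      # equilibrium in Figure 3, just before the crash
post_crash = 0.64       # postcrash equilibrium in Figure 4
meltup_start = 0.74     # price at which the upward jump commences
meltup_end = 1.043      # price reached after the upward jump

crash = pct_change(pre_crash, post_crash)
rise_to_meltup = pct_change(post_crash, meltup_start)
meltup = pct_change(meltup_start, meltup_end)

print(f"crash: {crash:.1f} percent")
print(f"rise from the low to the meltup point: {rise_to_meltup:.1f} percent")
print(f"meltup jump: {meltup:.1f} percent")
```

The crash comes out at about 31 percent, in line with the 30-percent drop cited earlier, and the low-to-meltup rise at just over 15 percent.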

Perhaps such an upward jump is possible only in the mind of the
theorist. However, over the period 1928-88, 22 of the 38 one-day
stock market moves that exceeded 7 percent were upward jumps, and
the financial press occasionally remarks on such a possibility (see
Anise Wallace, 1989).

FIGURE 5. AGGREGATE EXCESS DEMAND WITH
2.6-PERCENT UPWARD SHOCK. SYMBOLS:
O = UNOBSERVED; © = PARTIALLY OBSERVED;
A = ANTICIPATED

Figure 6 graphs the equilibrium price function relating p₀ to the
future price surprise, p − p̄, for the three different observability
cases. For the case with unobserved π, we see that the point of
discontinuity on the upper branch of the function is at p₀ = 0.927,
and the discontinuity on the lower branch is at p₀ = 0.740. For the
other two cases, there are no discontinuities given our example’s
parameters.

Future price surprises are not the only possible sources of
discontinuous price behavior. A random liquidity-supply shock could
also lead to discontinuous behavior. But whatever the cause, the
critical price (i.e., where the discontinuity occurs) will remain
the same. This leads us to examine the general nature of critical
points: when do they occur, and what determines their level?


1014 _ THE AMERICAN ECONOMIC REVIEW 


FIGURE 6. EQUILIBRIUM PRICE AS A FUNCTION OF THE SURPRISE
p − E(p). SYMBOLS: O = UNOBSERVED;
© = PARTIALLY OBSERVED; A = ANTICIPATED


VI. Discontinuities: Some General Results 


We now characterize price levels at which 
the price function becomes discontinuous 
and the minimum amount of hedging with 
which a “crash” can occur. These critical 
points depend upon the extent to which 
hedging can be observed. 

First consider the case in which hedging 
strategies are unobservable: investors are 
unaware of hedging strategies and thus do 
not distinguish them from unobservable 
liquidity trades. The equilibrium price p₀ is the price level for
which excess demand is equal to zero [eq. (5iii)].

Discontinuities will occur if the root (or roots) of (5iii) are
discontinuous functions of the variables p, L, and S. Since excess
demand is continuous and differentiable in p₀, discontinuities will
take place at points where the function reaches an extremum.
Differentiating (5iii) with respect to p₀ yields





ôX Dy | 1 ) 
ôPo 
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where F=1-Z/%>0 is the coefficient 
that obtains in the case of no hedging (agents 
are unaware of hedging in this case) and 
where i 





DoF 


The derivative of the demand for hedging (π′) tends to zero as
prices become large and as prices become small; hence, the
derivative of excess demand is negative at both very high and very
low prices.

The equilibrium price function will be discontinuous if and only if
there exists a p₀ such that (1 + FHπ′) < 0. This implies prices at
which the excess-demand curve is backward-bending and also implies
that


$$(6)\qquad 1 + FH\pi' = 0$$


admits a solution. Then, since π has a unique inflection point in
this case, equation (6) has two solutions, the critical prices c₁
and c₂ (c₂ > c₁). Excess demand is an increasing function of
equilibrium price p₀ in the interval (c₁, c₂) and decreasing
elsewhere. It can also be shown that the first critical point, c₁,
decreases as FH increases and the second, c₂, increases as FH
increases. Equation (6) has two roots if and only if


$$\omega FH > K e^{-3\sigma^2/2}\,(2\pi\sigma^2)^{1/2} \equiv \phi_{\min}$$


(where the π inside the square root is the mathematical constant,
not the hedging function), implying

$$\omega_{\min} = (FH)^{-1}\phi_{\min}.$$


The root is unique (and there is no discontinuity) when equality
obtains. ω_min represents the largest proportion of hedgers for
which a crash does not occur. Note that ω_min also is the upper
bound of ω for which the inverse price function f⁻¹(p₀) is
monotonically increasing.

The critical prices, c₁ and c₂, are given by


$$(7)\qquad c_1 = K\exp\{-2\sigma^2 - [2\sigma^2\ln(\omega FH/\phi_{\min})]^{1/2}\}$$

$$\phantom{(7)}\qquad c_2 = K\exp\{-2\sigma^2 + [2\sigma^2\ln(\omega FH/\phi_{\min})]^{1/2}\}$$

FIGURE 7. EQUILIBRIUM PRICE AS A FUNCTION OF
FUNDAMENTALS AND THE NUMBER OF HEDGERS


The difference c₂ − c₁ is the range of prices p₀ for which no stable
equilibrium exists; however, the amount of the price drop when c₂ is
reached from above is larger than this difference. In our base case,
the discontinuity occurs for a value of φ_min of 0.425, which is
reached for ω_min = 4.26 percent. This percentage of hedgers can
create a market crash in our example. Conversely, the market meltup
takes place when the equilibrium price reaches c₁ from below and the
price jumps to a level higher than c₂ (see Fig. 5). Figure 7 graphs
equilibrium price as a function of information signals and the
fraction of hedgers ω. The graph indicates a “cusp catastrophe”: for
low values of ω, there is a unique price equilibrium for each signal
realization; for higher values, three distinct equilibria exist in
the “fold” area. The fold points correspond to the critical points
of Figure 6. The cusp point corresponds to ω_min. Note that the
interval (c₁, c₂) corresponds to the interval over which the
function p₀ + FHπ(p₀) is decreasing in p₀ and, therefore, is the
range of prices over which the inverse price function f⁻¹(·) is
decreasing and multiple price equilibria exist.
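
The base-case numbers can be checked against the threshold and critical-price expressions. The sketch below is hedged on several points: it uses the formulas for φ_min and equation (7) as read from the printed text, it assumes σ = 0.2 (a value not stated in this section, chosen because it reproduces φ_min = 0.425 with K = 0.9), and it backs out the product FH from the quoted ω_min of 4.26 percent rather than from the model's primitives. Variable names are illustrative.

```python
import math

K = 0.9              # strike (protected level), from the example
omega = 0.05         # fraction of hedged assets, from the example
omega_min = 0.0426   # minimum crash-inducing fraction quoted in the text
sigma = 0.2          # ASSUMPTION: not stated in this section

# Threshold phi_min = K * exp(-3 sigma^2 / 2) * sqrt(2 * pi * sigma^2),
# where pi here is the mathematical constant.
phi_min = K * math.exp(-1.5 * sigma ** 2) * math.sqrt(2.0 * math.pi * sigma ** 2)

# omega_min = phi_min / (F * H), so the product FH is implied by the quoted values.
FH = phi_min / omega_min

# Critical prices from equation (7) as read from the text.
root = math.sqrt(2.0 * sigma ** 2 * math.log(omega * FH / phi_min))
c1 = K * math.exp(-2.0 * sigma ** 2 - root)   # lower critical price (meltup point)
c2 = K * math.exp(-2.0 * sigma ** 2 + root)   # upper critical price (crash point)

print(f"phi_min = {phi_min:.3f}, implied FH = {FH:.2f}")
print(f"c1 = {c1:.3f}, c2 = {c2:.3f}")
```

Under these assumptions the critical prices land within about half a percent of the discontinuity points read from Figure 6 (0.740 and 0.927); the small gap is consistent with rounding in the quoted ω_min.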

Now consider the case in which there is partial observation of
hedging strategies: supply-informed investors can observe the sum of
S and π. The same reasoning leads to an equation analogous to (6):


$$1 + FI\pi' = 0$$

implying

$$\omega_{\min} = (FI)^{-1}\phi_{\min}.$$


The critical points are obtained by substituting I for H in (7).
Since H > I in all cases, the minimum fraction ω_min (10.4 percent
in our example) is higher than in the previous case, c₁ is larger,
c₂ smaller, and ceteris paribus the price drop is smaller.

Finally, in the fully observed case, identical results obtain
provided that FH is replaced with Z: discontinuities require
1 + Zπ′ < 0. Note that this does not appear to be a major
restriction: ω would have to be enormous (over 100 percent).²⁵

In summary, crashes are most likely to occur in the unobserved case,
since the inequality is satisfied for the lowest values of ω_min.
Because the critical-point difference c₂ − c₁ is greatest in this
environment, the “crashes” associated with this environment will
also be the largest.
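
The backward-bending condition 1 + FHπ′ < 0 can be explored numerically. The sketch below is an illustration under stated assumptions rather than a reproduction of the paper's figures: it uses the textbook Black-Scholes d₁ (zero interest rate, one-year horizon), σ = 0.2, and a round value FH = 10, close to the ratio φ_min/ω_min implied by the base case. All of these, and the function names, are assumptions introduced here.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def hedging_slope(p0, omega, K=0.9, sigma=0.2):
    """Derivative of the hedging supply with respect to p0.

    ASSUMPTION: textbook d1 = (ln(p0/K) + sigma^2/2) / sigma, so that
    d(pi)/d(p0) = -omega * n(d1) / (p0 * sigma).
    """
    d1 = (math.log(p0 / K) + 0.5 * sigma ** 2) / sigma
    return -omega * norm_pdf(d1) / (p0 * sigma)


def backward_bending_prices(omega, FH, grid):
    """Grid prices at which 1 + FH * pi'(p0) < 0."""
    return [p for p in grid if 1.0 + FH * hedging_slope(p, omega) < 0.0]


grid = [0.5 + 0.001 * i for i in range(701)]   # prices from 0.5 to 1.2
FH = 10.0                                      # ASSUMPTION: round illustrative value

for omega in (0.02, 0.05):
    bend = backward_bending_prices(omega, FH, grid)
    if bend:
        print(f"omega = {omega:.2f}: backward-bending for p0 in "
              f"[{min(bend):.3f}, {max(bend):.3f}]")
    else:
        print(f"omega = {omega:.2f}: no backward-bending region")
```

With these inputs a 2-percent hedging share leaves excess demand monotone, while a 5-percent share produces a backward-bending range of prices below the initial equilibrium, which is the qualitative pattern the section describes.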


VII. Making Markets More Stable 


Our analysis suggests that unobserved 
hedging strategies can destabilize a market, 
leading to greater volatility and ultimately 
to a crash. Are there private or governmen- 
tal policies that would lessen the chances of 
such an event in the future? 

Outlawing hedging strategies is one such possibility, but it is
neither practical nor desirable. It is not practical because it is
not enforceable. An investor following a stop-loss or
portfolio-insurance hedging strategy can always claim he is doing so
for other reasons: an anticipated expense, a forecast of weak
markets, etc. Short of prohibiting selling for any reason, it is
impractical to prohibit selling for hedging purposes. Nor would it
be desirable. Investors are willing to participate in a market
because they can sell whenever they wish to, including for
risk-avoidance purposes.


²⁵This means that selling by hedgers following a Black-Scholes
put-option-replicating strategy would be met by the buying of
investors as prices fell continuously, even if hedgers’ selling (as
prices fell to zero) were 100 percent of initial supply. Indeed,
hedge selling would have to be 10 times more intensive than this
before price could fall discontinuously.

We note that the market is partially self-correcting. Stop-loss and
dynamic hedging strategies are fully effective only when prices move
continuously. The possibility of a crash will limit the use of
dynamic protection strategies. Of course, if relatively few
investors follow such strategies, crashes are unlikely to occur.

Portfolio protection is a legitimate aim of private investors. Is
there a way in which investors can achieve protection without
contributing to, or suffering from, discontinuous markets? Our
analysis provides some clues. The most important result is that
widespread knowledge of dynamic hedging usage can minimize its
impact on markets. The preceding section showed that the unobserved
hedging, which created a 30-percent crash in market prices, would
have less than a 1-percent impact on prices if it were observed by
all investors. Does this seem preposterous? Some postcrash evidence
suggests that it is not. On October 19, 1988, exactly one year after
the crash, the Japanese government sold over $24 billion of a single
stock, Nippon Telephone and Telegraph (NT&T). This was four times
the amount of all stocks that portfolio insurers had sold the year
before. Yet NT&T stock did not decline by a significant amount
(either at sale or at the time of initial announcement), because
investors had prior knowledge that the sale did not reflect an
informational change. Interestingly, portfolio insurers were anxious
to disseminate information about their trading requirements prior to
the crash, but events happened more quickly than regulatory
approval.²⁶


2 With the assistance of a major portfolio-insurance
firm (LOR), the New York Futures Exchange (NYFE) 
had requested the right to publicize large futures sales 
in advance. The theory behind the request was that 
preannouncement would allow time for the market to 
organize a competitive response. Prior to the crash, the 
NYFE proposal had been withdrawn, reportedly be- 
cause there were insufficient means of electronically 
disseminating the information. 
An alternative is for hedgers to use static
instruments that provide the same results as 
dynamic hedging strategies. For example, 
put options provide protection without re- 
quiring further trading. They would seem to 
be the ideal instrument to avoid the prob- 
lems of trading in uninformed (and there- 
fore illiquid) markets. A criticism of this 
argument is that it simply pushes the prob- 
lem back one level: the sellers of the put 
option will need to protect themselves 
through a dynamic hedging strategy. Even if 
this is true, however, at least there will be 
publicly available information about the 
number of outstanding put options. Astute 
observers can “reverse engineer” the dynamic
strategies that the open interest in
such options implies. If this information is
widely disseminated, we will have nearly
universal observation of π strategies.

Short of all investors being aware of 
hedging plans, our analysis also shows that 
the stability of markets is strongly affected 
by the number of supply-informed traders 
who can observe these plans. These 
market-makers play a role far beyond their 
numbers in increasing market liquidity. The 
crash that occurred in our example with no 
investors observing hedging could have been 
prevented if there had been as few as 0.03 
percent supply-informed investors (given w
= 0.05) observing hedging supply π.

To the extent that stock-exchange special- 
ists have privileged access to information on 
the nature of order flows, they play a key 
role in providing stability. Rules that limit 
free entry to this activity will leave markets 
considerably more vulnerable than other- 
wise. Electronic “open books” should be a 
seriously considered reform, and other 
forms of market organization (such as sin- 
gle-price auctions) should be examined. 

Low margin requirements in stock or 
derivatives markets can lead to an increased 
level of forced margin sales as prices fall. In 
effect, low margins increase the likely 
amount of stop-loss sales. If the extent of 
forced margin sales is difficult to observe, 
low margin requirements could increase the 
market’s vulnerability to crashes. 

Would price limits help? The answer is
no, unless such limits (and the trading halts
caused by their being reached) permitted
better dissemination of information on
hedgers’ selling. Absent this, price limits
would only delay the ultimate crash by a bit, 
without modifying its magnitude. Certainly 
the market did not seem to benefit from the 
“trading halt” created by the weekend of 
October 17-18. 


VIII. Conclusion


We have shown that information differ- 
ences among market participants can cause 
financial markets to be relatively illiquid. A 
small unobserved supply shock can create a 
large fall in prices. This is because the fall 
in prices affects investors’ expectations as 
well as their budgets. Traditional models 
which do not recognize that many investors 
are poorly informed will grossly overesti- 
mate the liquidity of stock markets. 

A consequence of diminished liquidity is 
that even relatively small unobserved trades 
by hedging programs can have a destabiliz- 
ing effect. We developed an example in 
which a market crash occurred when only 5 
percent of investors were following a hedg- 
ing program replicating a put option. 

Our model suggests how a crash caused 
by hedging in this country could be propa- 
gated to foreign markets, even when these 
markets do not have hedging programs such 
as portfolio insurance. Foreign investors, 
observing the large price drop in the U.S. 
market but ignorant of the extent of hedg- 
ing in that market, rationally infer that sig- 
nificant negative information must have 
been received by U.S. investors. To the ex- 
tent that this information is also significant 
for their own markets, foreign investors re- 
vise downward their expectations, causing 
prices to fall globally. 

Our model also indicates policies to mini- 
mize the chance of future crashes. These 
include the wide dissemination of knowl- 
edge about hedgers’ actions, marginal posi- 
tions, and the use of put options or related 
securities that provide hedging without re- 
quiring dynamic trading. This recommenda- 
tion supports a similar contention by Gross- 
man (1988a). Allowing wider access to the 
information in specialists’ books might also 
help to stabilize the market. In contrast,
price limits are unlikely to have useful effects
unless they are combined with greater
dissemination of trading information at the
time limits are reached.


APPENDIX 


Notation (with Example Parameters 
in Parentheses) 


Prices

$p_0$: current equilibrium price
$p$: realized end-of-period price
$\bar p$: unconditional expected end-of-period price (1.06)
$\bar p_i$: investor $i$'s conditional expectation of end-of-period price
$\Sigma$: unconditional variance of end-of-period price (0.08)
$Z_j$: class $j$ investor-conditional variance of $p$
$Z$: market power-weighted average conditional variance of $p$

Information

$m$: supply of shares divided by the sum of risk-tolerance coefficients; expectation $\bar m$ (1.503), variance $\Sigma_m$ (0.00034)
$p'_i$: $p + \varepsilon_i$; price signal observed by investor $i$ in class I
$\varepsilon_i$: price signal noise, uncorrelated across investors, uncorrelated with other random variables; ex ante variance $\Sigma_\varepsilon$ (0.4)
$S$: liquidity supply observed by investors SI; mean 0 and variance $\Sigma_S$ (0.00017)
$L$: unobserved liquidity supply; mean 0 and variance $\Sigma_L$ (0.00017); $L$ and $S$ are independent

Investors

SI: supply-informed investor class; observe $p_0$ and $S$
I: price-informed investor class; observe $p_0$ and $p'_i$
U: uninformed investor class; observe $p_0$
$j$: investor class SI, I, or U
$a_j$: investor-class $j$ risk tolerance
$w_j$: number of investors in class $j$
$k_j$: relative market power of class $j$; ratio of the product of $w_j$ and $a_j$ to the sum across classes: $k_j = a_j w_j / \sum_j a_j w_j$ ($k_I = 0.02$, $k_{SI} = 0.005$, $k_U = 0.975$)
$\pi(p_0)$: hedging share supply
$w$: fraction of share total hedged (5 percent)


PROOF OF THEOREM 1: 

We will assume that investors believe the
function $f^{-1}$ to be well-defined (i.e., a given
equilibrium price level $p_0$ obtains for only
one possible realization of the argument of
the function $f$). Subsequently we will show
that this belief is confirmed in equilibrium.
The variance-covariance matrix $V$ of the
three-signal vector

$$\begin{bmatrix} p' \\ S \\ f^{-1}(p_0) \end{bmatrix}$$

and the covariance vector $W$ of the signal
vector with the future price are given by

$$V = \begin{bmatrix} \Sigma + \Sigma_\varepsilon & 0 & \Sigma \\ 0 & \Sigma_S & -I\Sigma_S \\ \Sigma & -I\Sigma_S & \Sigma + H^2\Sigma_L + I^2\Sigma_S \end{bmatrix}, \qquad W = \begin{bmatrix} \Sigma \\ 0 \\ \Sigma \end{bmatrix}.$$

For simplicity, we have omitted the subscript
$i$ (for investor $i$) of $p'$. The distribution
of end-of-period prices conditional on
all three signals is normal with expectation
$\bar p_{II}$ and variance $Z_{II}$. Defining
$[A_{II}, B_{II}, C_{II}] = W^{T}V^{-1}$, where $W^{T}$ denotes the
transpose of $W$, leads to

$$Z_{II} = \Sigma - \operatorname{Cov}\{p, [p', S, f^{-1}(p_0)]\}^{T}\, V^{-1}\, \operatorname{Cov}\{p, [p', S, f^{-1}(p_0)]\}$$

$$Z_{II} = \Sigma - (A_{II}\Sigma + C_{II}\Sigma)$$
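(see, e.g., Morris DeGroot, 1975). Straightforward
and lengthy manipulation of the
equations leads to

$$Z_{II}^{-1} = \frac{1}{\Sigma} + \frac{1}{\Sigma_\varepsilon} + \frac{1}{H^2\Sigma_L}$$

$$Z_{II}^{-1}A_{II} = \frac{1}{\Sigma_\varepsilon}$$

$$Z_{II}^{-1}B_{II} = \frac{I}{H^2\Sigma_L}$$

$$Z_{II}^{-1}C_{II} = \frac{1}{H^2\Sigma_L}.$$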
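As a numerical cross-check of the step from $W^{T}V^{-1}$ to the closed forms, the sketch below builds $V$ and $W$ from the example parameters and solves the 3×3 system directly. It is only an illustration: the helper `solve3` and all variable names are introduced here, and the values of $H$ and $I$ are the ones read from the linear-case example later in the Appendix (19.95 and roughly 8.14).

```python
def solve3(V, w):
    """Solve V x = w for a 3x3 system (Gaussian elimination, partial pivoting)."""
    n = 3
    M = [list(V[r]) + [w[r]] for r in range(n)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


# Example parameters from the notation list (variances).
Sigma, Sigma_eps = 0.08, 0.4
Sigma_S = Sigma_L = 0.00017
# Assumed values of H and I, taken from the linear-case example.
H, I = 19.95, 8.14

V = [
    [Sigma + Sigma_eps, 0.0, Sigma],
    [0.0, Sigma_S, -I * Sigma_S],
    [Sigma, -I * Sigma_S, Sigma + H**2 * Sigma_L + I**2 * Sigma_S],
]
W = [Sigma, 0.0, Sigma]

# V is symmetric, so W^T V^{-1} equals the solution x of V x = W.
A_II, B_II, C_II = solve3(V, W)
Z_II = Sigma - (A_II * Sigma + C_II * Sigma)

# Closed-form precision of the fully informed investor.
closed_form_precision = 1 / Sigma + 1 / Sigma_eps + 1 / (H**2 * Sigma_L)
```

Comparing `1 / Z_II` with `closed_form_precision`, and `A_II / Z_II`, `B_II / Z_II`, `C_II / Z_II` with the three ratios above, should agree up to floating-point error for any positive parameter values, since the closed forms are algebraic identities.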


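These parameters would obtain for an
investor who could observe all the signals.
To derive the corresponding parameters for
the supply-informed investors (SI) it suffices
to take the limit of $\Sigma_\varepsilon$ at infinity. For
investors I and U, who do not observe the
signal $S$, the parameters are obtained by
replacing $H^2\Sigma_L$, the contribution to the
variance of $f^{-1}(\cdot)$ due to unobserved liquidity
trading, with $H^2\Sigma_L + I^2\Sigma_S$ in the
expression for the corresponding parameters
for II and SI, respectively. This yields

$$Z_{SI}^{-1} = \frac{1}{\Sigma} + \frac{1}{H^2\Sigma_L}, \quad Z_{SI}^{-1}A_{SI} = 0, \quad Z_{SI}^{-1}B_{SI} = \frac{I}{H^2\Sigma_L}, \quad Z_{SI}^{-1}C_{SI} = \frac{1}{H^2\Sigma_L}$$

$$Z_{I}^{-1} = \frac{1}{\Sigma} + \frac{1}{\Sigma_\varepsilon} + \frac{1}{H^2\Sigma_L + I^2\Sigma_S}, \quad Z_{I}^{-1}A_{I} = \frac{1}{\Sigma_\varepsilon}, \quad Z_{I}^{-1}B_{I} = 0, \quad Z_{I}^{-1}C_{I} = \frac{1}{H^2\Sigma_L + I^2\Sigma_S}$$

$$Z_{U}^{-1} = \frac{1}{\Sigma} + \frac{1}{H^2\Sigma_L + I^2\Sigma_S}, \quad Z_{U}^{-1}A_{U} = 0, \quad Z_{U}^{-1}B_{U} = 0, \quad Z_{U}^{-1}C_{U} = \frac{1}{H^2\Sigma_L + I^2\Sigma_S}.$$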


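The corresponding market power weighted
averages are given by

$$Z^{-1} \equiv \sum_j k_j Z_j^{-1} = \frac{1}{\Sigma} + \frac{k_I}{\Sigma_\varepsilon} + \frac{H^2\Sigma_L + k_{SI} I^2\Sigma_S}{H^2\Sigma_L\,(H^2\Sigma_L + I^2\Sigma_S)}$$

$$A \equiv \sum_j k_j Z_j^{-1}A_j = \frac{k_I}{\Sigma_\varepsilon}$$

$$B \equiv \sum_j k_j Z_j^{-1}B_j = \frac{k_{SI}\, I}{H^2\Sigma_L}$$

$$C \equiv \sum_j k_j Z_j^{-1}C_j = \frac{k_{SI}}{H^2\Sigma_L} + \frac{k_I + k_U}{H^2\Sigma_L + I^2\Sigma_S}.$$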


The total demand for shares of the three
classes of investors is equal to the total
supply plus hedging supply:

$$\sum_j k_j Z_j^{-1}(\bar p_j - p_0) = m + \pi. \tag{A1}$$

Reorganizing terms yields, at the limit of
economies with an infinite number of agents,

$$\frac{Z^{-1}p_0 + \pi - C f^{-1}(p_0) - Z^{-1}\bar p + \bar m}{A} = p - \bar p - \frac{1}{A}L - \frac{1-B}{A}S. \tag{A2}$$


This equation is consistent with equation (3)
if and only if the following set of equations
holds:

$$H = \frac{1}{A}, \qquad I = \frac{1-B}{A}, \qquad f^{-1}(p_0) = \frac{Z^{-1}p_0 + \pi - Z^{-1}\bar p + \bar m}{A + C}.$$


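Substituting $A$, $B$, and $C$ yields the unique
solution for $H$, $I$, and $f^{-1}$:

$$H = \frac{\Sigma_\varepsilon}{k_I}, \qquad I = H\,\frac{H\Sigma_L}{H\Sigma_L + k_{SI}}, \qquad f^{-1}(p_0) = \frac{Z^{-1}p_0 + \pi - Z^{-1}\bar p + \bar m}{Z^{-1} - \Sigma^{-1}}. \tag{A3}$$

The solution $f^{-1}$ is a well-defined function,
as asserted above.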
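To make the solution concrete, the following sketch evaluates $H$, $I$, $Z^{-1}$ and the inverse price function from the example parameters in the notation list. It assumes the forms $H = \Sigma_\varepsilon / k_I$, $I = H \cdot H\Sigma_L/(H\Sigma_L + k_{SI})$ and $Z^{-1} = \sum_j k_j Z_j^{-1}$; the function names are illustrative, not the authors'. Because the printed $k_j$ are rounded, the results land near, but not exactly on, the values quoted in the linear-case example (19.95, about 8.14, and 25.06).

```python
# Example parameters from the notation list.
Sigma, Sigma_eps = 0.08, 0.4
Sigma_S = Sigma_L = 0.00017
k_I, k_SI, k_U = 0.02, 0.005, 0.975
p_bar, m_bar = 1.06, 1.503

# Coefficients on unobserved (L) and observed (S) liquidity supply.
H = Sigma_eps / k_I
I = H * (H * Sigma_L) / (H * Sigma_L + k_SI)

# Market-power-weighted average precision Z^{-1}.
noise_without_S = H**2 * Sigma_L + I**2 * Sigma_S
Z_inv = (
    1 / Sigma
    + k_I / Sigma_eps
    + k_SI / (H**2 * Sigma_L)
    + (k_I + k_U) / noise_without_S
)
Z = 1 / Z_inv

# Slope and intercept of the linear price function when pi = 0.
F = 1 - Z / Sigma
intercept = p_bar - Z * m_bar


def f_inverse(p0, pi=0.0):
    """Inverse price function: the signal level consistent with price p0."""
    return (Z_inv * p0 + pi - Z_inv * p_bar + m_bar) / (Z_inv - 1 / Sigma)


def linear_price(x):
    """Equilibrium price for signal x = p - p_bar - H*L - I*S with no hedging."""
    return F * x + intercept
```

With no hedging, `f_inverse(linear_price(x))` returns `x` again, which is a quick way to confirm that the slope $F$ and the inverse function are consistent with each other.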


PROOF OF PROPOSITIONS 1 AND 2: 

The function $f^{-1}$ is continuous. Consequently,
the function $f$ is well-defined and
continuous if and only if $f^{-1}$, or equivalently
$Z^{-1}p_0 + \pi$, is strictly monotonic. If
$\pi(p_0) = 0$, $f^{-1}$ is strictly monotonic, since
$Z^{-1} > 0$; hence $f$ is well-defined and continuous.


Excess Demand. Substitution of the solutions
in equation (A2) yields the excess demand
(demand minus supply):

$$XD(p_0) = \frac{1}{H}\left[\, p - \bar p - HL - IS + \frac{Z^{-1}(\bar p - p_0) - (\bar m + \pi)}{Z^{-1} - \Sigma^{-1}} \right].$$


The Linear Case. When the demand stemming
from dynamic strategies is linear in $p_0$
(i.e., $\pi'$ is constant), $f(\cdot)$ is a linear function.
In the case of no hedging supply ($\pi = 0$),
the equilibrium price $p_0$ is given by

$$p_0 = \left(1 - \frac{Z}{\Sigma}\right)(p - \bar p - HL - IS) + \bar p - Z\bar m.$$

We will denote by $F$ the slope of the function
$f$; in this case, $F = 1 - Z/\Sigma$. In the
context of our example, we have $Z^{-1} = 25.06$
and

$$p_0 = 0.5\,(p - 1.06 - 19.95L - 8.14S) + 1.$$
PROOF OF PROPOSITION 3:

We first restrict our attention to the domain
of $p_0$ where $f^{-1}(p_0)$ is strictly increasing.
From (A3), $f^{-1}(p_0)$ is also differentiable,
with derivative proportional to
$Z^{-1} + \pi'(p_0) > 0$ over
this domain. As hedging activity $\pi(p_0)$ increases,
the derivative of $f^{-1}(p_0)$ decreases,
since $\pi' < 0$. Therefore, the derivative of
$f(\cdot)$ becomes larger, and the current price
becomes more sensitive to changes in the
signals. Since the signal volatility is exogenous,
this in turn implies that the current
price is more volatile. For sufficiently large
hedging activity, $f^{-1}$ actually decreases over
the range of prices $p_0$ for which

$$-\pi'(p_0) > Z^{-1}. \tag{A5}$$

Therefore, $f(\cdot)$ is multivalued, and discontinuities
within the set of stable equilibria can
occur, as demonstrated in the example of
Section VI.
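Condition (A5) can be checked on a price grid once a hedging supply schedule is chosen. The sketch below uses a purely hypothetical schedule, $\pi(p_0) = a\,\Phi((K - p_0)/s)$ with $\Phi$ the standard normal distribution function; the amplitude $a$, strike $K$, and width $s$ are assumptions made for illustration and do not come from the paper. Only $Z^{-1} = 25.06$ is taken from the Appendix example.

```python
import math

Z_INV = 25.06  # average precision quoted in the Appendix example


def hedge_supply_slope(p0, amplitude, strike=1.0, width=0.1):
    """pi'(p0) for the hypothetical supply amplitude * Phi((strike - p0) / width)."""
    z = (strike - p0) / width
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return -amplitude * density / width  # negative: hedgers sell as price falls


def prices_violating_a5(amplitude, grid):
    """Grid prices at which -pi'(p0) > Z^{-1}, so f^{-1} is locally decreasing."""
    return [p0 for p0 in grid if -hedge_supply_slope(p0, amplitude) > Z_INV]


grid = [0.5 + 0.01 * i for i in range(101)]  # prices from 0.50 to 1.50

# Hedging equal to 5 percent of mean supply (w = 0.05, m_bar = 1.503).
modest = prices_violating_a5(0.05 * 1.503, grid)

# A deliberately extreme schedule, one hundred times larger.
extreme = prices_violating_a5(100 * 0.05 * 1.503, grid)
```

Under these assumptions `modest` is empty while `extreme` contains prices close to the strike: when every investor takes hedging into account through $f$, hedging has to be very intense before the inverse price function turns downward.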


PROOF OF THEOREM 2: 

The proof closely follows that of Theorem 1.
The difference consists in the agents’
different beliefs about the structure of equilibrium
prices. In the first case, supply-informed
agents (SI) are aware of hedging
strategies and of their impact on prices. SI
agents know $f^{-1}$, the actual inverse price
function which obtains in equilibrium. Other
agents, ignorant of the presence of hedgers,
think that the linear functional form holds.
We assume that SI agents believe the coefficients
$H$ and $I$ to be unchanged and show
that it indeed holds in equilibrium. Equation
(A1) still holds, and a similar manipulation
leads to the analog of (A2):

$$\frac{Z^{-1}\,(p_0 - (\bar p - Z\bar m))}{Z^{-1} - \Sigma^{-1}} + \frac{\pi}{A + k_{SI} Z_{SI}^{-1} C_{SI}} = p - \bar p - \frac{1}{A}L - \frac{1-B}{A}S. \tag{A6}$$


Hence, the parameters $H$ and $I$ are unchanged,
and $f^{-1}(p_0)$ is given by the left-hand
side of equation (A6), because SI
agents know the true inverse equilibrium
price function $f^{-1}(\cdot)$. The solution $f^{-1}(p_0)$
is given by

$$f^{-1}(p_0) = \frac{Z^{-1}\,(p_0 - (\bar p - Z\bar m))}{Z^{-1} - \Sigma^{-1}} + I\pi. \tag{A7}$$

If hedging activity is totally unobserved,
similar derivations yield the same parameters
$H$ and $I$ as before, and the new inverse
equilibrium price function

$$f^{-1}(p_0) = \frac{Z^{-1}\,(p_0 - (\bar p - Z\bar m))}{Z^{-1} - \Sigma^{-1}} + H\pi. \tag{A8}$$
a a 


PROOF OF PROPOSITIONS 4 AND 5: 

The derivative of the inverse equilibrium
price function $d(f^{-1})/dp_0$ is equal to
$F^{-1} + (Z^{-1} - \Sigma^{-1})^{-1}\pi'$ in the fully observed
case, to $F^{-1} + I\pi'$ in the partially observed
case, and to $F^{-1} + H\pi'$ in the unobserved
case [eqs. (A3), (A7), and (A8)]. It is smallest
for the unobserved hedging activity case,
and it is smaller under partial observation
than in the fully observed case, because $\pi'$
is negative and $H > I > (Z^{-1} - \Sigma^{-1})^{-1}$ [by
combining the definition of $Z^{-1}$ and eq.
(A3)]. This implies that the derivative of $f$
(and therefore the volatility of $p_0$) is largest
in the case of unobserved hedging activity
and least in the case when hedging is fully
observed. The derivatives are negative if
$-\pi' > Z^{-1} = (Z^{-1} - \Sigma^{-1})F^{-1}$ in the observed
case, if $-\pi' > (FI)^{-1}$ in the partially
observed case, and if $-\pi' > (FH)^{-1}$ in the
unobserved case. Hence, as hedging activity
increases, discontinuities appear first in the
unobserved case, then in the partially observed
case, and finally in the perfectly observed
case.
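The ordering of the three cases can be put in numbers with the example values. The sketch below is an illustration only: it takes $\Sigma = 0.08$, $Z^{-1} = 25.06$, $H = 19.95$ and $I \approx 8.14$ as read from the Appendix example, and the dictionary and function names are introduced here for convenience.

```python
Sigma = 0.08
Z_inv = 25.06
H, I = 19.95, 8.14  # values read from the linear-case example

F = 1 - (1 / Z_inv) / Sigma  # slope of the linear price function

# Smallest value of -pi' at which d(f^{-1})/dp0 turns negative in each case.
thresholds = {
    "observed": Z_inv,
    "partially observed": 1 / (F * I),
    "unobserved": 1 / (F * H),
}


def d_f_inverse(pi_prime, case):
    """Derivative of the inverse price function for a given hedging slope pi'."""
    if case == "observed":
        return 1 / F + pi_prime / (Z_inv - 1 / Sigma)
    if case == "partially observed":
        return 1 / F + I * pi_prime
    if case == "unobserved":
        return 1 / F + H * pi_prime
    raise ValueError(f"unknown case: {case}")


# A hypothetical hedging slope, chosen only to illustrate the ordering.
example_slope = -0.3
signs = {case: d_f_inverse(example_slope, case) for case in thresholds}
```

With these inputs the unobserved threshold is close to 0.1 and the partially observed one is roughly 0.25, against 25.06 when hedging is fully observed. A hedging slope of $-0.3$ therefore makes the inverse price function decreasing in the first two cases while leaving it comfortably increasing in the last.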
Insider Trading in a Rational Expectations Economy


By LAWRENCE M. AUSUBEL* 


It is often argued that efficiency considerations require society to freely permit 
insider trading. In this article, an opposing efficiency argument is formalized. The 
model incorporates an investment stage followed by a trading stage. If “outsiders” 
expect “insiders” to take advantage of them in trading, outsiders will reduce their 
investment. The insiders’ loss from this diminished investor confidence may more 
than offset their trading gains. Consequently, a prohibition on insider trading may 
effect a Pareto improvement. Insiders are made better off if they can precommit 
not to trade on their privileged information; government regulation accomplishes
exactly this. (JEL 022, 026, 313)


The traditional rationale articulated for 
insider trading regulation and other securi- 
ties law is that such rules promote confi- 
dence in markets. Indeed, President 
Franklin D. Roosevelt justified the first major
U.S. securities legislation by saying: “It
should give impetus to honest dealing in
securities and thereby bring back public
confidence.”¹ Similar language is still invoked
half a century later, in connection
with enforcement efforts against insider 
trading and in proposals for tightened stock 
market regulation. 


*Department of Managerial Economics and Deci- 
sion Sciences, J. L. Kellogg Graduate School of Man- 
agement, Northwestern University, 2001 Sheridan 
Road, Evanston, IL 60208. My research received the 
gracious support of the Lynde and Harry Bradley 
Foundation, the Kellogg School’s Banking Research 
Center, and National Science Foundation Grant SES- 
86-19012. I thank Laurie Bagwell, Mike Fishman, Julie 
Nelson, Matt Spiegel, and three anonymous referees, 
as well as seminar participants at the 1988 Winter 
Meetings of the Econometric Society, the Midwest 
Mathematical Economics Meetings (April 1989), 
Northwestern University, and Indiana University, for 
helpful comments. 

¹77 Congressional Record 937 (March 29, 1933).
The quote is taken from President Roosevelt’s message 
to Congress in proposing legislation that became the 
Securities Act of 1933. This act requires the disclosure 
of information in connection with the initial offering 
of a security. The legislative underpinnings of fed- 
eral insider trading regulation are contained in the 
closely related Securities Exchange Act of 1934 (see 
Section I). 
Yet the weight of academic law-and-economics
commentary has been opposed to
the regulation of insider trading. Scholars 
have argued that permitting trade on the 
basis of inside information creates desirable 
incentives and, for a variety of reasons, im- 
proves economic efficiency. At the same 
time, it has been maintained that the fact of 
prices reflecting information would prevent 
insiders from earning significant trading 
profits at the expense of outsiders or that, in 
any event, outsiders are not harmed by in- 
sider trading. 

My objective in the current article will be 
to reformulate the confidence rationale as 
an economic argument for insider trading 
regulation. I develop a two-stage model, 
consisting of an investment stage followed 
by a trading stage. In the initial period, 
agents make their investment decisions 
based on their expected second-period re- 
turns, which in turn hinge on whether they 
will be “insiders” or “outsiders” and on 
whether insiders will be permitted to trade 
on their private information. The second 
period is a pure exchange economy (of en- 
dowments determined by the first-period in- 
vestments) in which it is feasible for insiders 
to exploit their private information in a 
partially revealing rational expectations 
equilibrium. 

For many plausible specifications of the 
model, the outcome when society regulates 
insider trading is a Pareto improvement over 
the outcome when insider trading is permitted.²
Under such scenarios, economic efficiency
would require the banning of insider
trading. The intuition for this conclusion, 
which contradicts most previous economic 
analyses of insider trading, is as follows. 
Abolition of insider trading in an exchange 
situation will typically improve the expected 
return on investment of outsiders. If the 
quantity of investment increases in the expected
return,³ then insider trading regulation
promotes investment by outsiders. To
the extent that insiders are helped by increased
outside investment,⁴ insiders thus
also benefit from insider trading regulation. 
In other words, insiders are made better off 
if they can somehow precommit not to trade 
on their privileged information; government 
regulation and enforcement accomplish ex- 
actly this. 

My analysis thus provides an economic 
formalization of the notion of confidence in 
markets. Let “confidence” be interpreted as 
the rational belief by outsiders that their 
return on investment is not being diluted by 
insiders’ trading. Then, perhaps, the goal of 
insider trading regulation and securities law 
truly is to foster confidence in markets. 
When confidence is promoted, outsiders and 
insiders may benefit alike. 

The article is organized as follows. In 
Section I, I define insider trading and criti- 
cally discuss the related literature. Section 
II provides an overview of the structure and


²The reader should be alerted to the words “many
plausible specifications of the model” in this sentence. 
This article demonstrates that, in some specifications, 
a ban on insider trading effects a Pareto improvement. 
In others, regulation works to help outsiders but harm 
insiders. See Sections VI and VII. 

³Contemporary government policies designed to
promote investment and savings seem to be premised 
on the notion that the quantity of investment increases 
in the expected return (i.e., that investment is not a 
Giffen good). 

⁴Traditional corporate insiders (e.g., officers and
directors) benefit from outside investment, because this 
investment is a source of needed capital for their 
organizations. Nontraditional insiders (e.g., investment 
bankers and arbitrageurs) also benefit from outside 
investment, because this investment is the origin of 
initial public offerings and secondary trades, which 
again contribute to the insiders’ livelihoods. 
essential ingredients of the model. In Section
III, I formally develop the trading stage
when insider trading is permitted; in Section
IV, I formally develop the investment
stage. Section V modifies the model to treat 
a regulatory regime in which insider trading 
is banned. Section VI contains a welfare 
analysis of insider trading regulation for a 
set of parameter values in the model econ- 
omy. Section VII provides my conclusions. 


I. A Brief Review 


A. Insider Trading Defined 


Insider trading occurs when an individual 
(commonly called an “insider’’) buys or sells 
securities on the basis of material, nonpub- 
lic information. American law imposes an 
abstain or disclose requirement on insiders. 
Suppose that an individual has privileged 
access to corporate information which is not 
generally available and which materially af- 
fects investment decisions concerning the 
company’s stock. Under the doctrine of In 
re Cady, Roberts & Co.⁵ and SEC v. Texas
Gulf Sulphur Co.,⁶ the insider is required to
choose between two options: he may either 
abstain from engaging in any trading activity 
in the security in question until such time 
that the information becomes public; or he 
may, himself, publicly disclose the informa- 
tion to the marketplace before trading. Fail- 
ure to abstain or disclose may subject the 
insider to civil liability and criminal prose- 
cution under Rules 10b-5 or 14e-3, which 
were promulgated by the Securities and Exchange
Commission under rule-making authority
granted by Congress in Sections 10(b)
and 14(e) of the Securities Exchange Act of 
1934. 


⁵40 SEC 907 (1961).

⁶401 F.2d 833 (2d Cir. 1968) (en banc), cert. denied,
394 U.S. 976 (1969).

⁷Rule 10b-5 is a general prohibition on fraudulent
acts and practices connected with the trading of securi- 
ties: the subsequent case law (beginning with the Cady, 
Roberts and Texas Gulf Sulphur cases) has interpreted 
this rule to proscribe insider trading. Rule 14e-3 is a 
ban on fraudulent or deceptive acts and practices 
specifically connected with tender offers. 
The term “insider,” as used here, refers
not merely to a traditional corporate insider 
but more broadly to any individual whose 
actions are confined by the insider trading 
laws. Whether an individual is considered to 
be an insider, and thus whether he is bound 
by the abstain-or-disclose requirement, may
depend on which of Rules 10b-5 or 14e-3 is 
being applied. In order to violate Rule 10b- 
5, the individual must be linked with the 
firm whose security is traded or with the 
nonpublic information in such a way that 
the use of the information in his trading is 
deemed to breach some fiduciary duty. Under
limitations set forth in Chiarella v.
United States⁸ and affirmed in Dirks v.
SEC,⁹ the insider’s fiduciary duty may derive
from: (a) working inside the firm (e.g.,
employment as an officer or director of the 
company whose securities are traded); (b) 
working outside the firm in a capacity which 
nevertheless leads to an obligation to share- 
holders (e.g., employment as an investment 
banker, lawyer, or accountant for the com- 
pany whose securities are traded); or (c) 
receiving information from another individ- 
ual whose conveyance of the information 
itself constitutes a breach of duty (e.g., re- 
ceiving a tip from a corporate officer who 
expects to benefit from the disclosure). Al- 
ternatively, under the so-called misappropri- 
ation theory, if an individual trades on the 
basis of information misappropriated from 
its source (typically, taken from the individ- 
ual’s employer), the misuse of information 
may itself constitute the breach of fiduciary 
duty that is required for conviction under 
Rule 10b-5.¹⁰


⁸445 U.S. 222 (1980).

⁹463 U.S. 646 (1983).

¹⁰The misappropriation theory was adopted by the
U.S. Court of Appeals, 2nd Circuit, in United States v. 
Newman, 664 F.2d 12 (2d Cir. 1981), cert. denied, 104 
S.Ct 193 (1984). It was considered inconclusively by the 
U.S. Supreme Court in Carpenter v. United States, 108 
S.Ct. 316 (1987). Divided in a 4—4 vote, the Court 
failed to overturn the insider trading convictions of 
Wall Street Journal reporter R. Foster Winans (and 
others) for prepublication trading on the basis of infor- 
mation that would appear in his “Heard on the Street” 
column. 
In contrast, Rule 14e-3 does not contain
any duty requirement in its notion of who
is an “insider.” (It does, however, retain
the notion from Rule 10b-5 that “willful
misconduct” is a prerequisite to any violation.¹¹)
If any individual (not necessarily a
person who has breached a fiduciary duty)
trades while in possession of material, nonpublic
information connected with a tender
offer by another party,¹² he may be subject
to prosecution for insider trading under
Rule 14e-3.¹³,¹⁴


B. The Classic Law-and-Economics View 


The classic law-and-economics view on 
insider trading can be briefly summarized as 
follows. Insider trading is banned today out 
of considerations of fairness. In an unregu- 
lated environment, insiders might be able to 
earn trading profits by utilizing information 
that outsiders cannot legally obtain. Out of 
some sentimental attachment to fairness, we 
enact insider trading regulations in order to 
level the securities market playing field, so 
that all traders have relatively equal access 
to information. 

Unfortunately, considerations of eco- 
nomic efficiency work in the opposite direc- 


¹¹United States v. Chestman, 704 F. Supp. 451
(S.D.N.Y. 1989); United States v. Marcus Schloss &
Co., Inc., 710 F. Supp. 944 (S.D.N.Y. 1989).

¹²A party planning to make a tender offer (or his
agent) is permitted to purchase shares on the open 
market in advance of a public announcement—pro- 
vided he does not run astray of other provisions of the 
Williams Act. 

¹³It should be added that there exists another federal
rule under which (only civil) insider trading liability
is possible. A traditional insider (an officer, director,
or major shareholder) is liable to his company for any 
profits he earned from matched purchases and sales of 
securities within the same six-month period (irrespective 
of whether it can be shown that he possessed material, 
nonpublic information), under Section 16 of the Secu- 
rities Exchange Act of 1934. The presumption behind 
this provision on “short swing” trading seems to be 
that, whenever an insider buys and sells in close prox- 
imity, it is likely to be on the basis of (possibly uniden- 
tifiable) private information. 

¹⁴The discussion of fiduciary duty contained in the
second and third paragraphs of this section is largely
drawn from Chapters 3, 6, and 7 of Donald Langevoort
(1990).
tion as those of fairness. First, if insiders
are permitted to trade freely on their pri- 
vate information, then information becomes 
more rapidly reflected in securities prices. 
Insider trading thus contributes to efficient 
markets and so to allocational efficiency, as 
proper capital-asset pricing leads to the optimal
allocation of capital resources. Second,
“profits from insider trading constitute
the only effective compensation scheme for 
entrepreneurial services in large corpora- 
tions” (Henry Manne, 1966b p. 116). As 
Manne viewed the world, individuals do lit- 
tle innovation except when they are af- 
forded the opportunity to share in the value 
they create; in large organizations, insider 
trading is basically the only mechanism for 
employees to obtain compensation for their 
innovations. 

Furthermore, the fairness considerations 
are misplaced, as insider trading is effec- 
tively a victimless crime: 


The insiders’ gain is not made at the 
expense of anyone. The occasionally 
voiced objection to insider trading— 
that someone must be losing the spe- 
cific money the insiders make—is not 
true in any relevant sense. 

[Manne, 1966a p. 61] 


Even if the redistributional concerns are 
real, they are difficult for economists to 
evaluate (or are irrelevant) because any 
profit derived from insider trading is an 
essentially costless transfer payment. Fi- 
nally, the fact that insider trading by the 
company’s own management is typically not 
banned by explicit provisions of the corpo- 
rate charter may be taken as evidence that 
governmental insider trading regulation 
does not enhance shareholder value (Dennis 
Carlton and Daniel Fischel, 1983). 


C. Some Criticisms of the Classic View


The model described in the following sections
of the article does not directly address
some of the above arguments. Instead, I
stake out a new argument which cuts in the
opposite direction. Thus, before proceeding 
with the new model, it may prove useful to 
review and articulate some direct responses 
to the classic view. 

In a very general sense, there exists a 
fundamental tension in the viewpoint that 
insider trading promotes economic effi- 
ciency. As Manne recognized in the first 
sentence of his 1966 treatise (but which 
remains equally true today), “Probably no 
aspect of modern corporate life has been 
more roundly condemned than insider trad- 
ing.” It is somewhat awkward to reconcile 
his view (of insider trading as the guarantor 
of efficient markets and the protector of 
entrepreneurship in the modern corpora- 
tion) with the almost universal opprobrium 
that society directs toward practitioners of 
insider trading. 

If insider trading is efficient (or even if 
insiders as a group benefit from the prac- 
tice), there remains a political economy 
puzzle as to why insider trading regulations 
are ever promulgated. As David Haddock 
and Jonathan Macey (1987 p. 312) observe: 


Modern public-choice theory suggests 
that regulatory actions, including the 
decisions of the SEC, will divert wealth 
from relatively diffuse groups toward 
more coalesced groups whose mem- 
bers have strong individual interests in 
the regulation’s effect. Yet, if one 
adopts the conventional view that the 
battle lines of insider trading regula- 
tion are drawn between insiders and 
ordinary shareholders (or the general 
public), the SEC would seem to be 
channeling wealth that otherwise 
would be captured by a group with 
relatively cohesive interests (insiders) 
toward those with extremely weak and 
diffuse interests (ordinary sharehold- 
ers or the general public). 


The fact that, empirically, we witness pro- 
scriptions against the practice becomes 
much less a mystery if insiders have a group 
interest in precommitting not to trade on 
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their private information, as I will argue 
below.15

More specifically, Manne’s incentive argument
has been criticized on the ground that
insider trading, as a compensation device,
creates a moral hazard problem.16 An individual
who has the abilities both to generate
and to trade on inside information is given
the perverse incentive to generate “bad”
news, which is easier to create than “good”
news yet equally profitable to trade on (by
selling short, instead of buying long).17
Meanwhile, a company-granted call option
is probably a more finely tuned instrument
for giving an employee a stake in the value
of the corporation’s stock than is legalized
insider trading (also avoiding the moral hazard
problem). In any case, the incentive
argument would not appear to be especially 
relevant to the recent rage of insider trad- 
ing, which has mostly involved market pro- 
fessionals (e.g., investment bankers and 
arbitrageurs), rather than traditional corpo- 
rate “insiders” engaging in entrepreneurial 
activities. 

Researchers have also challenged the notion
that insider trading necessarily increases
the rapidity with which information


15 As evidence that insiders may actually wish to
quash insider trading, see the recent comments of 
Arthur Levitt, Jr., chairman of the American Stock 
Exchange: “If the investor thinks he’s not getting a fair 
share, he’s not going to invest and that is going to hurt 
capital formation in the long run” (quoted in Business 
Week, April 29, 1985, p. 79). Also observe that the 
Insider Trading and Securities Fraud Enforcement Act 
of 1988, which greatly increased the monetary penalties 
and jail terms for these crimes, ultimately gained the 
backing of the Securities Industry Association, a lead- 
ing industry group (as reported in The New York Times, 
October 23, 1988, pp. 1, 15). Finally, it is interesting 
that both the 1988 act and the Insider Trading Sanc- 
tions Act of 1984 (which stiffened penalties and plugged 
the options loophole) passed the U.S. Congress with- 
out any dissenting votes. 

16 For a longer discussion of the moral hazard problem,
see Joel Seligman (1985 pp. 1094-6). For a general
discussion of the relative merits of compensating
managers by allowing them to trade on private
information, see Ronald Dye (1984).

17 It is worth observing here that Section 16 of the
Securities Exchange Act of 1934 prohibits short selling
by officers and directors of shares in their own company.
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becomes reflected in stock prices. Victor 
Brudney (1979 note 43), Frank Easterbrook 
(1981) and others have noted that the 
prospect of insider trading may give corpo- 
rate insiders an incentive to delay the dis- 
closure of information to the marketplace. 
In a recent paper, Michael Fishman and 
Kathleen Hagerty (1989) argue that the 
presence of insider trading may discourage 
outsiders (e.g., stock analysts) from inde- 
pendently generating information, perhaps 
leading to less informative securities prices. 

Finally, it has been observed that the 
failure of firms to ban insider trading on 
their own does not constitute conclusive evidence
that public regulation is inefficient.
One retort is offered by Richard Posner 
(1986 p. 393), who notes that “if the proba- 
bility of detection is so low that heavy 
penalties—which private companies are not 
allowed to impose—would be necessary to 
curtail the practice, it might not pay compa- 
nies to try to curtail it.” Examples of “heavy 
penalties” available only to public enforcers 
include prison terms and lifetime debar- 
ment from the securities industry. More- 
over, if insider trading is forbidden by the 
government and if such laws are not ex- 
pected to change, then the presence of trad- 
ing restrictions in corporate charters would 
be redundant and unnecessary. 


D. Other Related Literature 


There exists a fairly extensive empirical 
and experimental literature on insider trad- 
ing. James Lorie and Victor Niederhoffer 
(1968), Jeffrey Jaffe (1974), Nejat Seyhun 
(1986), and others have examined the prof- 
itability of trading rules based on the actual 
purchases and sales of corporate officers, 
directors and major stockholders (who are 
required, by Section 16 of the Securities 
Exchange Act of 1934, to report their trans- 
actions). The studies have found that insid- 
ers can, in fact, earn extranormal trading 
profits. Extensive (unpublished, but widely 
publicized) experimental work on insider 
trading was conducted in the mid-1980’s by
R. Foster Winans, Dennis Levine, Ivan
Boesky, Drexel Burnham Lambert Inc., and
others. The experimental studies were able 
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to replicate the conclusions that had been 
reached by the earlier empirical articles. 

A long line of theoretical articles has 
addressed the issue of information revelation
in an asymmetrically informed market
with a large number of traders. In Beth 
Allen (1981), Douglas Diamond and Robert 
Verrecchia (1981), Sanford Grossman and 
Joseph Stiglitz (1980), James Jordan (1983), 
Roy Radner (1979), and other models, some 
or all of the informed agents’ private infor- 
mation is revealed to uninformed agents via 
the inversion of the rational expectations 
equilibrium price function. I have provided 
a more thorough review of the microeco- 
nomic rational expectations equilibrium lit- 
erature in the introduction of a previous 
article (Ausubel, 1990). 

Some other theoretical articles have ex- 
amined information revelation in a market 
in which only a single agent possesses a 
relevant piece of information. Some of this 
literature also explicitly discusses insider 
trading. Douglas Gale and Martin Hellwig 
(1987), Richard Kihlstrom and Andrew 
Postlewaite (1983), Albert Kyle (1985), 
Jean-Jacques Laffont and Eric S. Maskin 
(1990), and others have studied the strategic 
revelation of information, which necessarily 
becomes an issue in this context.18


II. An Overview of the Model 


Consider a two-period model with two 
goods, two components to the state of the 
world, and two types of representative 
agents, who are denoted insiders and out- 
siders. In the first period, which occurs be- 
fore any agent has received private informa- 
tion, insiders individually decide how much 
labor to invest in producing good x and 
outsiders individually decide how much la- 


18 Insider trading research by Roland Benabou and 
Guy Laroque (1989), Utpal Bhattacharya and Matthew 
Spiegel (1989), Jurgen Dennert (1989), and Michael 
Manove (1989) has also come to my attention since the 
initial preparation of the current article. These papers 
use a wide diversity of modeling techniques but share 
with the current article a healthy degree of skepticism 
toward the efficiency claims of proponents of insider 
trading. 
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bor to invest in producing good y. Between 
the first and second periods, insiders are 
privately informed of the state of the world, 
while outsiders receive no private informa- 
tion. (The state affects agents by entering 
into their state-dependent utility functions.) 
In the second period, insiders and outsiders 
trade in a pure exchange economy, where 
agents’ “endowments” (which are now 
treated as exogenous) equal their invest- 
ment decisions of the first period. 

Even in a trading process where insiders 
are permitted to trade freely on the basis of 
their private information, outsiders learn at 
least some of the insiders’ information, via 
the rational expectations equilibrium (REE) 
price function. However, since there are two 
components to information but only one 
relative price to reveal it, the REE is only 
partially revealing. As a consequence, outsiders
make nontrivially inferior decisions to
those they would make under full informa- 
tion. Finally, after the second period, all 
information becomes public, and utilities are 
realized based on the state of the world and 
agents’ holdings of the two goods at the end 
of the second period. The timing of events 
is illustrated in Figure 1. 

I now introduce a modeling device which 
is meant to represent insider trading regula- 
tion. Any agent who has been designated an 
insider is given the choice between two al- 
ternatives in the trading round: he may 
either abstain from trade and, hence, con- 
sume precisely his endowment brought for- 
ward from the investment round; or he may 
publicly disclose his information to the mar- 
ketplace before trading. This modeling de- 
vice is intended to capture the essential 
empirical details of the first paragraph of 
Section I, while abstracting away from any 
of the legal technicalities in the second and 
third paragraphs of that section. Under a 
regulatory regime where insider trading is 
banned in this manner, each of the repre- 
sentative insiders in the model is induced to 
disclose his information, so that the trading 
round is transformed into one of complete 
information. The outcome of the trading 
round is thus changed, and in anticipation 
of that change, the outcome of the investment
round also changes.

[Figure 1 (timeline). Period one: insiders and outsiders invest labor in producing goods. Between periods one and two: insiders are informed of the state of the world. Period two: insiders and outsiders trade in a pure exchange economy, with endowments determined by their period-one actions. After period two: outsiders are informed of the state of the world, and state-dependent utilities are realized based on holdings determined in period two.]

FIGURE 1. TIMING OF EVENTS IN THE INSIDER TRADING MODEL


This analysis makes three basic method- 
ological advances over previous research on 
insider trading. First, I examine the ex ante 
efficiency of two regulatory regimes, com- 
paring agents’ expected utilities before the 
start of both trading and the underlying 
investment. Previous analyses have ignored 
incentive effects on the level of investment 
activity, only examining utilities preceding a 
trading round.19 Second, I fully specify all
agents’ utility functions and perform a com- 
plete welfare comparison. In contrast, ear- 
lier work typically specified trading rounds 
utilizing “noise traders” (who are not ex- 
plicitly given utility functions) and linear- 
quadratic functional forms (which yield 
prices that are sometimes negative); inclu- 
sion of either of these features makes wel- 
fare analysis problematical. Third, I prove 


19 Some earlier commentators have informally taken
an ex ante view in discussing the fairness of insider
trading. For example, Kenneth Scott (1980 p. 809)
writes: “The fairness concern proves to have surprisingly
little substance when viewed in terms of the game
as a whole rather than as a single, isolated play.” Just
as Scott used an ex ante approach to counter the usual
fairness argument, I use an ex ante approach to counter
the usual efficiency argument.


that my model exhibits a unique equilib- 
rium. Other articles on the subject have 
frequently chosen a convenient (i.e., linear) 
solution without resolving whether there exist
other equilibria possessing possibly different
qualitative characteristics.

Rational expectations equilibrium in a 
competitive economy specifically models a 
situation in which there are large numbers
of insiders and large numbers of outsiders. 
In particular, insiders act not only as price- 
takers but also as information-takers; the 
modeling technique has insiders ignore that 
they are affecting the aggregate amount of 
information available to outsiders when the 
insiders determine their use of information. 
This modeling device might fairly well de- 
scribe a situation such as the November 
1988 takeover of Triangle Industries by 
Pechiney, for which it has been reported 
that at least eight separate buyers independently
bought stock in advance of the acquisition
announcement (on the basis of material
nonpublic information).20


20 See, for example, The New York Times, January
30, 1989, p. 28. The reader may initially react to the
exchange-economy formulation of the trading round in 
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The REE modeling device would not so 
well describe a scenario in which a single 
trader uniquely possessed the private infor- 
mation. However, it seems evident that sub- 
stitution of a trading round along the latter 
lines would only exacerbate the efficiency 
problem. If the insider need not behave as 
an information-taker, his trading profits 
would presumably increase, further inhibit- 
ing outsiders from investing. Competitive 
use of information is apparently the most 
friendly terrain for favoring insider trading;
monopolistic use of information would seem 
only to strengthen the case against it. 

It is illuminating to highlight briefly the 
ingredients of the model that drive the re- 
sults. First, in order for insiders to profit at 
the expense of outsiders, it is necessary that 
the equilibrium of the trading round be only 
partially revealing. Second, in order for the
disparity of information in the trading round
to make a difference, it is necessary that
insiders and outsiders have different preferences
on the underlying commodities.21


Section 4 as bearing little relation to a corporate 
takeover context such as Pechiney/Triangle. The rela- 
tion becomes much clearer if the terms of the model 
are reinterpreted. The state-dependent utilities in my 
model could be viewed as representing the conditional 
expected utilities that shareholders obtain from hold- 
ing Triangle’s stock. The two components of the state 
of the world could be thought of as a continuous 
random variable, $\beta$, representing the intrinsic earnings
potential of Triangle’s capital assets, and a dichotomous
random variable, $\gamma$, representing whether or not
Pechiney is planning a takeover. 

21 This is a natural assumption to make in the market
for commodities I use here, since different agents
can easily have different preferences over the two 
commodities. It may not be so obvious to the reader 
that heterogeneity of preferences is as natural an as- 
sumption to make in a market for stocks, since every- 
body prefers a high return to a low return. In fact, 
some recent papers have argued that heterogeneity of 
preferences for stocks is quite natural. Laurie Bagwell 
(1988) and Yves Balcer and Kenneth Judd (1987) show 
that, in the presence of a capital gains tax which is 
imposed upon the realization (rather than the accrual) 
of a gain, current (taxable) shareholders who pur- 
chased the stock at different prices have different ob- 
jectives. Bagwell and Judd (1988) demonstrate that 
shareholders with different levels of risk aversion and 
different marginal propensities to consume have dif- 
ferent objectives. Empirical work is also beginning to 
confirm this assumption. Andrei Shleifer (1986) finds
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Third, in order to obtain the strong welfare 
result that banning insider trading makes 
everybody better off (in some examples), it is 
useful that: (a) income effects are such that 
the quantity of investment by outsiders in- 
creases as the return on investment in- 
creases; and (b) insiders derive some benefit 
from the investment of outsiders. I will for- 
mally outline the model by first giving a 
description of the second stage of the model 
and then giving a description of the initial 
stage. 


III. The Trading Stage When Insider
Trading Is Permitted 


When insider trading is permitted, the 
second period is modeled as an example of 
my (1990) partially revealing rational expec- 
tations model. The state of the world con- 
sists of two independent random variables: 
a continuous random variable, $\beta$, which is
uniformly distributed on the unit interval
$I = [0,1]$; and a dichotomous random variable,
$\gamma$, which takes on the two elements of
$\Gamma = \{H, T\}$ (“heads” or “tails”) with probability
$h$ assigned to H and $1 - h$ attached
to T. The realization, $(\beta, \gamma)$, is payoff-relevant
to agents because it enters into their
(state-dependent) utility functions. 

Agents are divided into two types, accord- 
ing to their private information. There is a 
continuum of identical insiders (whose utili- 
ties, endowments, and demands are sub- 
scripted by 1) and a continuum of identical 
outsiders (subscripted by 2), each indexed 
by the unit interval. Insiders privately learn 
precisely the true realization of $(\beta, \gamma)$ between
periods one and two, while outsiders
do not directly learn the realization until 
after period two. However, as we shall see, 
outsiders will indirectly infer some informa- 
tion about the state by observing the price 
(which, in turn, is influenced by the insiders’ 
actions). 





that demand curves for the purchase of stock (which 
are added to the S&P 500 Index) are downward sloping,
as opposed to horizontal. Bagwell (1989) finds that
supply curves for the sale of stock (in Dutch-auction 
repurchases) are upward sloping, as opposed to hori- 
zontal. 
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There are two goods, denoted x and y. 
Prices for the two goods are assumed to be 
nonnegative and are normalized to sum to 
one. I usually only explicitly mention the
price of good x, which I denote by the
function $p(\cdot,\cdot)$ and the scalar $\phi$. A representative
insider begins period two with an
exogenous endowment $(\bar{x}_1, \bar{y}_1)$, where $\bar{x}_1, \bar{y}_1 \ge 0$,
and trades to a consumption of $(x_1, y_1)$,
where $x_1, y_1 \ge 0$. Similarly, a representative
outsider begins period two with an exogenous
endowment $(\bar{x}_2, \bar{y}_2)$ and trades to a
consumption of $(x_2, y_2)$. Since each of the
two types of agents is indexed by an interval
of length one, the aggregate endowments
and demands are also given by $(\bar{x}_1, \bar{y}_1)$,
$(x_1, y_1)$, $(\bar{x}_2, \bar{y}_2)$, and $(x_2, y_2)$. In the next
section, endowments will be endogenized
when we introduce period one; we will then
have $\bar{x}_1 > 0$, $\bar{y}_2 > 0$, and $\bar{x}_2 = 0 = \bar{y}_1$.

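Agents’ utilities derived from consumption
are given by state-dependent, Cobb-Douglas
utility functions. Let the representative
insider’s utility function be given by


(1)  $$U_1(x_1, y_1; \beta, \gamma) = \begin{cases} x_1^{\alpha_H(\beta)}\, y_1^{1 - \alpha_H(\beta)} & \text{if } \gamma = H \\ x_1^{\alpha_T(\beta)}\, y_1^{1 - \alpha_T(\beta)} & \text{if } \gamma = T \end{cases}$$


where $\alpha_H(\beta) = \beta^{\mu_H}$, $\alpha_T(\beta) = \beta^{\mu_T}$, and $\mu_H$
and $\mu_T$ are unequal positive constants. Let
the representative outsider’s utility function
be given by


(2)  $$U_2(x_2, y_2; \beta, \gamma) = x_2^{\beta}\, y_2^{1 - \beta}$$


for $\gamma = H, T$.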

It can immediately be shown that this 
competitive model, as specified, does not 
possess any fully revealing rational expecta- 
tions equilibrium. The reasoning is as fol- 
lows. Suppose that there were a fully reveal- 
ing REE, in other words, an equilibrium 
with the property that an outsider, by observing
the market-clearing price, could infer
the precise state $(\beta, \gamma)$. Then, the forms
of the utility functions in equations (1) and
(2) imply that good x is valueless when
$\beta = 0$ and that good y is valueless when
$\beta = 1$. Consequently, p(0,H) [i.e., the price
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when the state is (0,H)] would equal zero, 
p(1,H) would equal one, and the “heads
branch” of states would occupy all prices 
between. Similarly, p(0,T) would equal zero, 
p(1,T) would equal one, and the “tails 
branch” of states would occupy all prices 
between. Now choose any price $\phi$ such that
$0 < \phi < 1$. Then $\phi$ would be the price associated
with two states [$(\beta, H)$ and $(\beta', T)$],
contradicting the hypothesis that an out- 
sider could infer the precise state from ob- 
serving the market-clearing price. 

The argument of the previous paragraph 
further suggests that equilibria of this com- 
petitive model will be pairwise revealing 
(i.e., the outsider should be able to infer 
from price that the state is one of exactly 
two possibilities). Indeed, one can prove 
using a variation of theorem 5 in Ausubel 
(1990) that any REE of this model is necessarily
characterized by a monotone continuous
price function which, moreover, is
pairwise revealing.22 Taking this fact as
given, I will now provide existence and 
uniqueness results by direct construction. 

Observe that the representative insider 
has full information. For a given price $\phi$, let
$w_1 = \phi \bar{x}_1 + (1 - \phi)\bar{y}_1$ denote the insider’s
wealth. Then, in any state $(\beta, \gamma)$, the insider
seeks to maximize $U_1(x_1, y_1; \beta, \gamma)$ subject to
the budget constraint $\phi x_1 + (1 - \phi) y_1 \le w_1$.
Since utility is Cobb-Douglas, the insider’s 
demand for each good is given by his wealth, 
divided by the good’s price and multiplied 
by the exponent to which the good’s con- 


22 It is easy to see that, within the class of monotone
and continuous REE price functions, only pairwise-revealing
equilibria are possible. Consider any price
function $p(\cdot, \gamma)$ that is monotone and continuous for
each of $\gamma = H$ and T. As in the main text, p(0,H) =
p(0,T) = 0 and p(1,H) = p(1,T) = 1; therefore for every
$\phi$ ($0 < \phi < 1$), there exists unique $\beta$ [$0 < \beta < 1$] and
unique $\alpha(\beta)$ [$0 < \alpha(\beta) < 1$] such that $p(\beta, H) =
p(\alpha(\beta), T) = \phi$.

The reasoning behind theorem 5 of Ausubel (1990) 
establishes that, within the class of (Borel measurable) 
REE price functions, only pairwise-revealing equilibria 
are possible. Borel measurability should be considered 
part of the definition of rational expectations equilib- 
rium. Existence is treated more generally in theorem 2 
and corollary 1 of Ausubel (1990). 
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sumption is raised: 


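(3)  $$x_1(\phi, \bar{x}_1, \bar{y}_1; \beta, \gamma) = [w_1/\phi]\,\alpha_\gamma(\beta)$$
     $$y_1(\phi, \bar{x}_1, \bar{y}_1; \beta, \gamma) = [w_1/(1 - \phi)]\,[1 - \alpha_\gamma(\beta)].$$


Define $\alpha(\beta) \equiv \alpha_T^{-1}(\alpha_H(\beta)) = \beta^{\mu_H/\mu_T}$. Using
the fact that $\alpha_H(\beta) = \alpha_T(\alpha(\beta))$, it is easy to
see that if the price $\phi$ is identical in states
$(\beta, H)$ and $(\alpha(\beta), T)$, then an insider will
display identical demands in those two
states.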
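
The pairing of states can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values that do not come from the article (the constants `MU_H`, `MU_T`, `PHI`, `X_BAR`, `Y_BAR`, `BETA` and the function names are hypothetical). It evaluates the insider demands of equation (3) in a heads state and in its paired tails state at a common price.

```python
def alpha(beta: float, mu_h: float, mu_t: float) -> float:
    """Paired tails state for a heads state beta: beta ** (mu_H / mu_T)."""
    return beta ** (mu_h / mu_t)


def insider_demand(phi, x_bar, y_bar, beta, gamma, mu_h, mu_t):
    """Equation (3): wealth / price, times the Cobb-Douglas exponent."""
    wealth = phi * x_bar + (1.0 - phi) * y_bar
    share_x = beta ** (mu_h if gamma == "H" else mu_t)  # alpha_gamma(beta)
    x = (wealth / phi) * share_x
    y = (wealth / (1.0 - phi)) * (1.0 - share_x)
    return x, y


# Illustrative (hypothetical) values, not taken from the article.
MU_H, MU_T = 2.0, 0.5
PHI, X_BAR, Y_BAR, BETA = 0.4, 1.0, 0.5, 0.6

demand_heads = insider_demand(PHI, X_BAR, Y_BAR, BETA, "H", MU_H, MU_T)
demand_tails = insider_demand(
    PHI, X_BAR, Y_BAR, alpha(BETA, MU_H, MU_T), "T", MU_H, MU_T
)

# The two demand vectors should agree up to floating-point error.
print(demand_heads, demand_tails)
```

Because the exponent on good x is the same in both states, the insider's behavior at a given price cannot by itself tell an outsider which of the two states has occurred.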

Now suppose that equilibrium price $\phi$ is
uniquely associated with the states $(\beta, H)$
and $(\alpha(\beta), T)$. Then, upon observing price
$\phi$, an outsider cannot determine which of
these two states has actually occurred. It is
tempting to conclude that the conditional
probabilities to place on $(\beta, H)$ and $(\alpha(\beta), T)$
should merely equal the prior probabilities
of H and T, respectively. However, this intu- 
ition is misleading and ignores the fact that 
the two branches of the price function typi- 
cally have different slopes at a given price 
observation. Hence, the observation confers 
additional information. To gain a better in- 
tuition for the conditional probabilities, it is 
helpful to refer to Figure 2. In the Appendix,
it is formally demonstrated that


(4)  $$\pi(\beta) \equiv \Pr\left[(\tilde{\beta}, \tilde{\gamma}) = (\beta, H) \mid p(\tilde{\beta}, \tilde{\gamma}) = p(\beta, H)\right] = \frac{h}{h + [1 - h]\,\alpha'(\beta)}$$


is the correct conditional probability attached
to $(\beta, H)$, and therefore, $1 - \pi(\beta)$ is
the conditional probability attached to the
state $(\alpha(\beta), T)$. A representative outsider
thus determines his demands $x_2(\phi, \bar{x}_2, \bar{y}_2; \beta, H)$
and $y_2(\phi, \bar{x}_2, \bar{y}_2; \beta, H)$ for the two
goods in state $(\beta, H)$ by solving


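(5)  $$\max_{x_2, y_2}\ \{\pi(\beta)\,U_2(x_2, y_2; \beta, H) + [1 - \pi(\beta)]\,U_2(x_2, y_2; \alpha(\beta), T)\}$$
     $$\text{subject to}\quad \phi x_2 + (1 - \phi) y_2 \le \phi \bar{x}_2 + (1 - \phi) \bar{y}_2$$


which equally determines his demands $x_2(\phi, \bar{x}_2, \bar{y}_2; \alpha(\beta), T)$
and $y_2(\phi, \bar{x}_2, \bar{y}_2; \alpha(\beta), T)$ in
state $(\alpha(\beta), T)$. This maximization problem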
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yields 
(6)  $$x_2(\phi, \bar{x}_2, \bar{y}_2; \beta, H) = x_2(\phi, \bar{x}_2, \bar{y}_2; \alpha(\beta), T) = [w_2/\phi]\,\{\pi(\beta)\beta + [1 - \pi(\beta)]\alpha(\beta)\}$$
     $$y_2(\phi, \bar{x}_2, \bar{y}_2; \beta, H) = y_2(\phi, \bar{x}_2, \bar{y}_2; \alpha(\beta), T) = [w_2/(1 - \phi)]\,\{\pi(\beta)[1 - \beta] + [1 - \pi(\beta)][1 - \alpha(\beta)]\}$$


where $w_2 = \phi \bar{x}_2 + (1 - \phi)\bar{y}_2$. It may assist
the intuition to observe that $\{\pi(\beta)\beta + [1 - \pi(\beta)]\alpha(\beta)\}$
and $\{\pi(\beta)[1 - \beta] + [1 - \pi(\beta)][1 - \alpha(\beta)]\}$
are, respectively, the expected
values of the exponents to which the consumptions
of goods x and y are raised in
the outsider’s Cobb-Douglas utility function.
Hence, similar to (3), an outsider’s
demand for each good is given by his wealth,
divided by the good’s price and multiplied
by the expected value of the exponent to
which the good’s consumption is raised.
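
To make the role of the slope term concrete, here is a minimal sketch that evaluates the conditional probability of equation (4) and the outsider demands of equation (6) exactly as stated in the text. The parameter values and function names are illustrative assumptions, not taken from the article; the derivative of the pairing function is computed from its closed form, beta ** (mu_H / mu_T).

```python
def alpha(beta, mu_h, mu_t):
    """Pairing function: beta ** (mu_H / mu_T)."""
    return beta ** (mu_h / mu_t)


def alpha_prime(beta, mu_h, mu_t):
    """Derivative of the pairing function with respect to beta."""
    k = mu_h / mu_t
    return k * beta ** (k - 1.0)


def pi_heads(beta, h, mu_h, mu_t):
    """Equation (4): probability of (beta, H) given the observed price."""
    return h / (h + (1.0 - h) * alpha_prime(beta, mu_h, mu_t))


def outsider_demand(phi, x_bar, y_bar, beta, h, mu_h, mu_t):
    """Equation (6): wealth / price, times the expected exponent."""
    wealth = phi * x_bar + (1.0 - phi) * y_bar
    prob = pi_heads(beta, h, mu_h, mu_t)
    expected_exponent = prob * beta + (1.0 - prob) * alpha(beta, mu_h, mu_t)
    x = (wealth / phi) * expected_exponent
    y = (wealth / (1.0 - phi)) * (1.0 - expected_exponent)
    return x, y


# Illustrative (hypothetical) values: a fair coin and mu_H / mu_T = 2.
H_PROB, MU_H, MU_T = 0.5, 2.0, 1.0

# Where the slope of the pairing function is one, the posterior equals the prior.
posterior_at_half = pi_heads(0.5, H_PROB, MU_H, MU_T)
# Where the slope is below one, heads becomes more likely than the prior.
posterior_at_quarter = pi_heads(0.25, H_PROB, MU_H, MU_T)

x2, y2 = outsider_demand(0.4, 0.0, 1.0, 0.25, H_PROB, MU_H, MU_T)
print(posterior_at_half, posterior_at_quarter, x2, y2)
```

With these values the posterior on heads coincides with the prior only at the single point where the pairing function has unit slope, which is the sense in which the price observation confers additional information.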

Suppose that there is a function $p(\cdot, \gamma)$
from states of the world to prices that is
continuous and strictly monotone in $\beta$, with
p(0,H) = p(0,T) and p(1,H) = p(1,T). Then
for every $\phi$ satisfying $p(0,H) < \phi < p(1,H)$,
precisely two states are associated with $\phi$.
We will define a pairwise revealing rational
expectations equilibrium to be a price function
of this form, together with demand
functions from equations (3) and (6) that
satisfy


(7)  $$x_1(p(\beta, \gamma), \bar{x}_1, \bar{y}_1; \beta, \gamma) + x_2(p(\beta, \gamma), \bar{x}_2, \bar{y}_2; \beta, \gamma) = \bar{x}_1 + \bar{x}_2$$


for every $\beta$ ($0 < \beta < 1$) and $\gamma = H, T$; that is
to say, agents optimize using the correct
rational expectations inference, and markets
always clear.23


23 Observe that, if the market for good x clears in
every state, then by Walras’ Law, the market for good 
y must also clear in every state. 
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It is straightforward to solve for a closed 
form of the price function. Substituting (3)
and (6) and the definitions of $w_1$ and $w_2$
into (7) and solving for $(1 - \phi)/\phi$ (i.e., for
$[1 - p(\beta, H)]/p(\beta, H)$) yields


(8)  $$\psi(\beta) \equiv \frac{1 - p(\beta, H)}{p(\beta, H)} = \frac{\bar{x}_1\{1 - \alpha_H(\beta)\} + \bar{x}_2\{\pi(\beta)[1 - \beta] + [1 - \pi(\beta)][1 - \alpha(\beta)]\}}{\bar{y}_1\{\alpha_H(\beta)\} + \bar{y}_2\{\pi(\beta)\beta + [1 - \pi(\beta)]\alpha(\beta)\}}$$


where $\pi(\cdot)$ is given by equation (4). It is
easy to see that the price function implied
by $\psi(\cdot)$ [in eq. (9), below] is always continuous
and satisfies p(0,H) = p(0,T) and
p(1,H) = p(1,T). If $\psi(\cdot)$ is also a strictly
monotone function, then we have constructed
a pairwise revealing REE. This establishes
the following theorem.


THEOREM 1: If period one results in endowments
$\bar{x}_1$, $\bar{x}_2$, $\bar{y}_1$, and $\bar{y}_2$ such that $\psi(\cdot)$
of equation (8) is strictly monotone in $\beta$, then
period two has a unique rational expectations
equilibrium. In this event, the price function
is pairwise revealing and is given by


(9)  $$p(\beta, \gamma) = \begin{cases} 1/[1 + \psi(\beta)] & \text{if } \gamma = H \\ 1/[1 + \psi(\alpha^{-1}(\beta))] & \text{if } \gamma = T \end{cases}$$


where $\alpha^{-1}(\beta) = \beta^{\mu_T/\mu_H}$.
If $\psi(\cdot)$ is not monotone, then period two
does not possess any REE.
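
The construction behind Theorem 1 can be traced numerically. The sketch below evaluates the ratio in equation (8) and the price function in equation (9), then checks on a grid whether the ratio is strictly decreasing and whether the market for good x clears at the implied price. All parameter values and names (`ENDOW`, `H_PROB`, `MU_H`, `MU_T`, the grid) are illustrative assumptions rather than values from the article; the endowment tuple is ordered as (insider x, insider y, outsider x, outsider y).

```python
def alpha(beta, mu_h, mu_t):
    return beta ** (mu_h / mu_t)


def alpha_inv(beta, mu_h, mu_t):
    return beta ** (mu_t / mu_h)


def pi_heads(beta, h, mu_h, mu_t):
    """Equation (4), using the closed-form derivative of alpha."""
    k = mu_h / mu_t
    return h / (h + (1.0 - h) * k * beta ** (k - 1.0))


def expected_exponent(beta, h, mu_h, mu_t):
    prob = pi_heads(beta, h, mu_h, mu_t)
    return prob * beta + (1.0 - prob) * alpha(beta, mu_h, mu_t)


def psi(beta, endow, h, mu_h, mu_t):
    """Equation (8): (1 - p) / p on the heads branch."""
    x1, y1, x2, y2 = endow
    a_h = beta ** mu_h
    e_x = expected_exponent(beta, h, mu_h, mu_t)
    return (x1 * (1.0 - a_h) + x2 * (1.0 - e_x)) / (y1 * a_h + y2 * e_x)


def price(beta, gamma, endow, h, mu_h, mu_t):
    """Equation (9): price of good x in state (beta, gamma)."""
    b = beta if gamma == "H" else alpha_inv(beta, mu_h, mu_t)
    return 1.0 / (1.0 + psi(b, endow, h, mu_h, mu_t))


def excess_demand_x(beta, endow, h, mu_h, mu_t):
    """Left side minus right side of equation (7) in state (beta, H)."""
    x1, y1, x2, y2 = endow
    phi = price(beta, "H", endow, h, mu_h, mu_t)
    w1 = phi * x1 + (1.0 - phi) * y1
    w2 = phi * x2 + (1.0 - phi) * y2
    demand = (w1 / phi) * beta ** mu_h + (w2 / phi) * expected_exponent(beta, h, mu_h, mu_t)
    return demand - (x1 + x2)


# Illustrative (hypothetical) values; insiders hold only x, outsiders only y.
ENDOW, H_PROB, MU_H, MU_T = (1.0, 0.0, 0.0, 1.0), 0.5, 2.0, 1.0
grid = [i / 100 for i in range(1, 100)]
psi_values = [psi(b, ENDOW, H_PROB, MU_H, MU_T) for b in grid]
is_strictly_decreasing = all(a > b for a, b in zip(psi_values, psi_values[1:]))
print(is_strictly_decreasing, excess_demand_x(0.3, ENDOW, H_PROB, MU_H, MU_T))
```

A grid check of this kind only suggests monotonicity for the chosen parameters; it is not a substitute for the analytical condition in the theorem.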


IV. The Investment Stage When Insider 
Trading Is Permitted 


The first period is modeled quite simply. 
Agents individually decide how much labor 
to invest in producing endowment for the 
second period. At the time of their deci- 
sions, agents know whether they will be 
insiders or outsiders in the second period 
but do not yet possess any private information
(and so they apply the prior distributions
on $\beta$ and $\gamma$). To simplify the subsequent
analysis, assume that insiders can only
produce good x and outsiders can only pro- 
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duce good y. The disutility of labor for a 
representative insider producing x units of
endowment is given by $L_1(x)$, and for a
representative outsider producing y units of
endowment is given by $L_2(y)$. We assign
$L_1(\cdot)$ and $L_2(\cdot)$ the following functional
forms:


(10)  $$L_1(x) = \omega_1 x^{\rho_1} \quad \text{and} \quad L_2(y) = \omega_2 y^{\rho_2}, \qquad \omega_i > 0,\ \rho_i > 1 \quad (i = 1, 2).$$


The solution concept for the two-stage 
game will essentially require the play to be 
a Nash equilibrium in the first period and a 
rational expectations equilibrium in the sec- 
ond period. Sequential rationality is im- 
posed in the sense that agents in the first 
period compute payoffs assuming equilibrium
play in the second period (i.e., the
solution is required to be a backward induction
equilibrium).

As indicated above, insiders (and out- 
siders) are indexed by the unit interval. 
Hence, if all representative insiders (outsiders)
decide in period one to produce $\bar{x}_1$
($\bar{y}_2$), then their aggregate endowment entering
period two equals $\bar{x}_1$ ($\bar{y}_2$). Moreover,
any individual agent’s investment decision 
has absolutely no effect on aggregate en- 
dowment (nor on the REE price function), 
and so he takes aggregate endowments (and 
the resulting REE price function) as given 
when selecting his investment. 

In order to state the first-period optimiza- 
tion problems, expressions are needed for 
the second-period payoffs from individual 
choices of endowment when aggregate endowment
is expected to equal $(\bar{x}_1, \bar{y}_2)$. Let
$V_1(x)$ ($V_2(y)$) denote the ex ante expected
utility (excluding the $L_i(\cdot)$ term) to a representative
insider (outsider) who has carried
forward x (y) units of endowment into
period two, before learning anything about
the state. Formulas for $V_1(x)$ and $V_2(y)$ are
derived in equations (A2)-(A5) of the Appendix.

Now let $X_1(\bar{x}_1, \bar{y}_2)$ signify the optimal
endowment for an insider to produce individually
in the first period if he expects aggregate
endowment in the second period to
equal $(\bar{x}_1, \bar{y}_2)$. Then $X_1(\bar{x}_1, \bar{y}_2)$ solves


(11)  $$\max_{x}\ \{-L_1(x) + V_1(x)\}$$


where $V_1(\cdot)$ is derived using the price function
from aggregate endowments $(\bar{x}_1, \bar{y}_2)$.
Similarly, let $Y_2(\bar{x}_1, \bar{y}_2)$ signify the optimal
endowment for an outsider to produce under
these circumstances. Then $Y_2(\bar{x}_1, \bar{y}_2)$
solves


(12)  $$\max_{y}\ \{-L_2(y) + V_2(y)\}.$$


I will say that $(\bar{x}_1, \bar{y}_2)$ corresponds to a
competitive equilibrium of the two-stage game
if, given that the aggregate endowments will
be $(\bar{x}_1, \bar{y}_2)$, each representative insider optimizes
by producing $\bar{x}_1$ and each representative
outsider optimizes by producing $\bar{y}_2$. In
the above notation


(13)  $$\bar{x}_1 = X_1(\bar{x}_1, \bar{y}_2) \quad \text{and} \quad \bar{y}_2 = Y_2(\bar{x}_1, \bar{y}_2).$$


This notion of competitive equilibrium requires
that $(\bar{x}_1, \bar{y}_2)$ determines a strictly
monotone function $\psi(\cdot)$ in equation (8).
Otherwise, there does not exist any REE in
the second stage (see Theorem 1). I will say
that $(\bar{x}_1, \bar{y}_2)$ corresponds to a candidate
competitive equilibrium if each of the prereq- 
uisites for competitive equilibrium is satis- 
fied, except possibly the monotonicity re- 
quirement. 
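
As a hedged sketch of what the candidate-equilibrium conditions amount to, suppose (an assumption not stated in this section) that the second-period payoff functions $V_1$ and $V_2$ are differentiable and that both first-period optima are interior. Then the problems in (11) and (12), with the labor disutilities of (10), reduce to first-order conditions, and (13) is a fixed point of the two best responses:

```latex
\begin{align*}
\text{from (11):}\quad & \omega_1 \rho_1 x^{\rho_1 - 1} = V_1'(x)
  && \text{at } x = X_1(\bar{x}_1, \bar{y}_2), \\
\text{from (12):}\quad & \omega_2 \rho_2 y^{\rho_2 - 1} = V_2'(y)
  && \text{at } y = Y_2(\bar{x}_1, \bar{y}_2), \\
\text{from (13):}\quad & (\bar{x}_1, \bar{y}_2)
  = \bigl(X_1(\bar{x}_1, \bar{y}_2),\; Y_2(\bar{x}_1, \bar{y}_2)\bigr).
\end{align*}
```

The aggregate endowments enter the right-hand sides only through the price function that each atomistic agent takes as given, which is why the equilibrium is a fixed point rather than a single optimization.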

The following theorem is proved in the 
Appendix. 


THEOREM 2: The two-stage model in which 
insider trading is permitted has a unique can- 
didate competitive equilibrium. 

Let aggregate endowments in the candidate
equilibrium be given by $(\bar{x}_1, \bar{y}_2)$. If the function
$\psi(\cdot)$ implied by $(\bar{x}_1, \bar{y}_2)$ in equation (8)
is monotone decreasing in $\beta$, then the two-stage
model has a unique competitive equilibrium.
If the function $\psi(\cdot)$ implied by $(\bar{x}_1, \bar{y}_2)$
is not monotone, then there does not exist any
competitive equilibrium.


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V. Modification of the Model When Insider 
Trading Is Banned 


In this section, I modify the two-period 
model that has been thus far examined in 
order to discuss the effects of the abstain- 
or-disclose requirement that have already 
been seen in Sections I and II. As said
before, the representative insider is ac- 
corded two options in the second period: 
either to abstain from trading (i.e., to con- 
sume precisely his endowment brought for- 
ward from the first period) or to disclose the 
state before trading. I now make three addi- 
tional modeling assumptions about the dis- 
closure technology. First, any disclosure (if 
made) is constrained to be complete and 
truthful. Second, while disclosure is costless, 
insiders lexicographically prefer “not disclosing”
to “disclosing” (all other things being
equal).24 Third, outsiders directly learn
the true state if and only if an interval (of
positive length) of insiders choose to disclose.25

These additional assumptions on the dis- 
closure technology exclude the unattractive 
possibility of equilibria in which insiders 
voluntarily disclose their information in the 
“insider trading permitted” regime. With- 
out the lexicographic preference toward 
nondisclosure, one insider might disclose 
merely because one or more other insiders 
were also disclosing. (If he unilaterally devi- 
ated by not disclosing, there would still be 
other agents disclosing, and so his payoff 


24 This is merely the limit, as $\varepsilon \downarrow 0$, of considering a
positive cost $\varepsilon$ of disclosure.

To be precise, we assume that outsiders directly 
learn the true state if and only if a strictly positive 
measure of insiders chooses to disclose. An assumption 
along these lines is necessary to make the disclosure 
stage consistent with the notion that agents are “infor- 
mation takers” (i.e., that any agent should take the 
market information as-given because his individual 
actions have no effect on the information available in 
the marketplace). This notion is implicit in competitive 
rational expectations equilibrium and is used through- 
out the paper. It would be fairly bizarre to require that 
a set of measure zero of insiders have absolutely no 
effect on the information available in the market in the 
trading and investment rounds, but then to allow that a 
set of insiders of measure zero could choose to fully 
inform the market in the disclosure stage. 
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would be unchanged, rendering the devia- 
tion unprofitable.) With the additional as- 
sumptions, insiders do not disclose their 
private information unless they are required 
to do so in order to trade. (Whether or not 
one single insider chooses to disclose does 
not change whether zn interval of positive 
length has disclosed and, hence, does not 
change the trading outcome.) At the same 
time, there is no inference for outsiders to 
draw from the fact that insiders have not 
disclosed their information, except that in- 
siders were not required to disclose (cf. 
Grossman, 1981). 

In the present model, insiders possess an 
independent reason for wishing to trade in 
the asset in question: they wish to acquire 
the commodity y from outsiders. (Observe 
that, quite generally in a model of this type, 
insiders will prefer to disclose rather than 
abstain. Under essentially any set of prices, 
the insider can attain strictly higher utility 
by trading than by abstaining from trade.) 
The abstain-or-disclose regulation therefore 
induces the informed agents to reveal their 
information, rendering the trading round an 
exchange economy with full information. 

Thus, the bottom line of the subsequent
analysis will be to compare a full-information
equilibrium (when insider trading is
banned) to that of a partially revealing REE
(when insider trading is permitted) in the
second round. However, the conclusion is
likely to differ from that of standard com- 
parisons of full-information versus partial- 
information economies, because the 
second-round equilibrium feeds into the de- 
termination of the first-period outcome. 
Given exogenous endowments $(\bar{x}_1, \bar{y}_2)$, insiders
would typically do better when permitted
to trade on their private information
than in the full-information economy; they 
would earn trading profits at the expense of 
outsiders, who typically do worse when in- 
sider trading is permitted. However, since 
endowments are actually endogenous, the 
welfare comparison may be more compli- 
cated than (and different from) this straight- 
forward result. 

The analysis of the model when insider trading is banned parallels the analysis of Sections III and IV (where insider trading was permitted), but the problem now becomes easier. As the abstain-or-disclose regulation induces insiders to reveal their information before they trade, outsiders now form full-information demands, which are given by


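$$x_2^*(\phi, x_2, y_2; \beta, \gamma) = [w_2/\phi]\,\beta \tag{6'}$$

$$y_2^*(\phi, x_2, y_2; \beta, \gamma) = [w_2/(1-\phi)]\,[1-\beta]$$

whereas insiders’ demands, $x_1^*$ and $y_1^*$, are still described by (3). Solving $x_1^* + x_2^* = \bar{x}_1 + \bar{x}_2$ now yields

$$\psi^*(\beta) = \frac{1 - p^*(\beta, H)}{p^*(\beta, H)} = \frac{\bar{x}_1\{1 - \alpha_H(\beta)\} + \bar{x}_2\{1 - \beta\}}{\bar{y}_1\{\alpha_H(\beta)\} + \bar{y}_2\{\beta\}} \tag{8'}$$

immediately providing a closed form for $p^*(\beta, \gamma)$. The ex post utility functions, $V_1^*$ and $V_2^*$, are now calculated from (A2) and (A3) using $p^*$, $x_1^*$, $y_1^*$, $x_2^*$, and $y_2^*$. The ex ante expected utility functions, $V_1^*$ and $V_2^*$, are calculated analogously as in (A4) and (A5). Optimal investment functions, $X_1^*(\cdot, \cdot)$ and $Y_2^*(\cdot, \cdot)$, can be defined analogously as in equations (11) and (12). Any $(\bar{x}_1^*, \bar{y}_2^*)$ is a competitive equilibrium if

$$\bar{x}_1^* = X_1^*(\bar{x}_1^*, \bar{y}_2^*) \tag{13'}$$

and

$$\bar{y}_2^* = Y_2^*(\bar{x}_1^*, \bar{y}_2^*).$$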

It is straightforward to modify the argument 
in the Appendix and prove that the ratio of 
endowments, $\bar{z}^* = \bar{y}_2^*/\bar{x}_1^*$, in a competitive
equilibrium corresponds to the unique fixed 
point of a particular mapping. It is no longer 
necessary to worry about whether the result- 
ing price function is monotone, as agents 
are no longer required to draw any infer- 
ences from price. Existence and uniqueness 
of equilibrium are assured when insider 
trading is banned. This establishes the following theorem.

THEOREM 3: The two-stage model in which insider trading is banned has a unique competitive equilibrium.
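
The reason the banned regime is easier can be illustrated with a small computation. The sketch below clears the market for good x in a stripped-down full-information exchange economy. It uses the outsiders' demand form given above (a share of wealth spent on x, with prices p and 1 - p) and assumes, purely for illustration, that the insider also spends a fixed share `a` of wealth on x and that each type is endowed with only one good. The function names and the share `a` are hypothetical and are not taken from the paper.

```python
def full_info_price(x1: float, y2: float, a: float, beta: float) -> float:
    """Price p of good x (good y costs 1 - p) that clears the x market.

    Insider: endowment (x1, 0), spends share a of wealth on x (assumed form).
    Outsider: endowment (0, y2), spends share beta of wealth on x.
    Market clearing implies (1 - p) / p = x1 * (1 - a) / (y2 * beta).
    """
    psi = x1 * (1.0 - a) / (y2 * beta)
    return 1.0 / (1.0 + psi)


def excess_demand_x(p: float, x1: float, y2: float, a: float, beta: float) -> float:
    """Aggregate demand for x minus aggregate supply of x at price p."""
    insider_wealth = p * x1
    outsider_wealth = (1.0 - p) * y2
    return a * insider_wealth / p + beta * outsider_wealth / p - x1


p_star = full_info_price(x1=289.339, y2=255.315, a=0.3, beta=0.6)
print(p_star, excess_demand_x(p_star, 289.339, 255.315, 0.3, 0.6))
```

Because the price is available in closed form for every state, no agent has to invert a price function to infer information, which is why the monotonicity requirement drops out when insider trading is banned.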


VI. Welfare Implications of Insider Trading 
Regulation in the Model 


Theorems 2 and 3 are very easy to apply 
to examples. While one cannot calculate 
closed-form solutions, the model is suffi- 
ciently simple and well-behaved that numer- 
ical approximations to the unique equilibria 
can be rapidly calculated on a microcomputer.
Computations are best done using
the reduction from endowment pairs $(\bar{x}_1, \bar{y}_2)$
to ratios $\bar{z} = \bar{y}_2/\bar{x}_1$ developed in the Appendix.

Let $U_1(\cdot)$, $U_2(\cdot)$, $L_1(\cdot)$, $L_2(\cdot)$, $\alpha_H(\cdot)$, $\alpha_T(\cdot)$, etc. be specified as in previous sections. Let $\omega_1 = \omega_2 = \omega$ and $\rho_1 = \rho_2 = \rho$, so that insiders and outsiders are assigned essentially identical disutilities of labor (in an attempt to treat them symmetrically and to avoid predetermining the conclusion).27

I report now the quantitative results of an illustrative simulation and the qualitative results when the parameter values from this simulation are varied. Let $\mu_H = 2/5$ and let $\mu_T = 5/2$. Set the probability $h$ equal to 1/2. Meanwhile, set $\omega = 1/10$ and $\rho = 5/4$. By Theorem 2, the model in which insider trading is permitted has a unique candidate equilibrium; the ratio of endowments is calculated to equal $\bar{z} = 0.882410$. It is easily verified that the resulting price function in equations (8) and (9) is monotone, so there is indeed a unique competitive equilibrium. By Theorem 3, the model where insider trading is banned has a unique competitive equilibrium; a (simpler) computation finds that the ratio of endowments is $\bar{z}^* = 0.947958$. As described in the Appendix, these ratios immediately determine the endowment pairs under each of the two regulatory regimes considered. The simulation yields the investments and expected utilities given in Table 1 [where utility now includes both $U_i(\cdot)$ and $-L_i(\cdot)$, calculated in double precision]. The price functions that arise in the unregulated and regulated regimes are plotted in Figures 3 and 4, respectively.

27 By giving insiders the same investment incentives as outsiders (as opposed to making insiders’ investments inelastic) and by formulating the model so that there are “as many” insiders as outsiders, one makes it quite plausible that banning insider trading would reduce aggregate investment (since, while outsiders would invest more, insiders would seemingly invest less).

TABLE 1. INVESTMENTS AND EXPECTED UTILITIES YIELDED BY THE SIMULATION
($\omega = 1/10$, $\rho = 5/4$, $h = 1/2$, $\mu_H = 2/5$, and $\mu_T = 5/2$)

INSIDER TRADING PERMITTED:
  Investment of Insiders            289.339
  Investment of Outsiders           255.315
  Expected Utility of Insiders       29.833
  Expected Utility of Outsiders      25.514

INSIDER TRADING BANNED:
  Investment of Insiders            295.195
  Investment of Outsiders           279.833
  Expected Utility of Insiders       30.590
  Expected Utility of Outsiders      28.613
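
The endowment ratios quoted in the text can be cross-checked against the investments in Table 1, since each ratio is the outsiders' investment divided by the insiders' investment. The following is a small arithmetic sketch; the variable and function names are chosen here for illustration.

```python
# Investments as reported in Table 1.
permitted = {"insiders": 289.339, "outsiders": 255.315}
banned = {"insiders": 295.195, "outsiders": 279.833}


def endowment_ratio(investments: dict) -> float:
    """Ratio of outsiders' to insiders' investment, z = y2 / x1."""
    return investments["outsiders"] / investments["insiders"]


z_permitted = endowment_ratio(permitted)
z_banned = endowment_ratio(banned)
print(f"permitted: {z_permitted:.6f}  banned: {z_banned:.6f}")
```

Both ratios agree with the reported values 0.882410 and 0.947958 up to the rounding of the tabulated investments.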

The qualitative conclusions are as follows. First and most importantly, the banning of insider trading may indeed effect a Pareto improvement. In this example, it raises the representative insider’s utility by 2.5 percent and raises the representative outsider’s utility by 12.1 percent. Second, the principal mechanism by which this occurs is that the greater return on investment in the regulated regime induces greater investment by the outsiders. This, in turn, improves returns to insiders and actually induces greater investment by the insiders as well. Third, if the trading round had been examined in isolation, we would have instead found that the insider trading ban was merely redistributional; if agents had entered the trading stage with the (unregulated) investments of 289.339 and 255.315, but the abstain-or-disclose rule was now imposed, insiders would earn expected utilities of only 27.085, while outsiders would earn expected utilities of 31.779. Fourth, it should be noted that, in this particular model, insider trading regulation actually improves market efficiency as well; the outsiders attain full information strictly sooner.
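
The third point can be made concrete with the figures quoted in the text. Holding investments fixed at their unregulated levels, the sketch below compares the unregulated expected utilities (29.833 for insiders and 25.514 for outsiders) with the counterfactual utilities under the abstain-or-disclose rule (27.085 and 31.779). The helper name is illustrative only.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# Fixed (unregulated) investments; only the trading rule changes.
insider_change = pct_change(29.833, 27.085)
outsider_change = pct_change(25.514, 31.779)
print(f"insiders: {insider_change:+.1f}%  outsiders: {outsider_change:+.1f}%")
```

With investments held fixed, the ban moves the two groups in opposite directions, which is the purely redistributive effect. The Pareto improvement appears only once investment is allowed to respond.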

Suppose that, starting from the parameter values of the above example, $\omega$ were to unilaterally vary anywhere in the domain $0 < \omega < \infty$. It is interesting that, for any such variation, the welfare implication that regulating insider trading effects a Pareto improvement is qualitatively preserved. This conclusion is also robust to unilateral perturbations of $\rho$ anywhere in $1 < \rho < \infty$. (If $\rho \le 1$, the strict convexity of $L_i(\cdot)$ is lost.)

A wider range of welfare conclusions is obtained by instead simultaneously perturbing the exponents $\mu_H$ and $\mu_T$. In particular, the welfare implication that insider trading regulation effects a Pareto improvement holds only for some combinations of parameter values. For other combinations, insider trading regulation will help outsiders but harm insiders. (Other welfare implications are logical possibilities, but I have not found any examples that lead to these possibilities.) In Figure 5, the welfare implications are calculated on a grid (of width 0.1) of values from the square $\{(\mu_H, \mu_T): 0 < \mu_H < 5 \text{ and } 0 < \mu_T < 5\}$. On most of the square, insider trading regulation indeed helps both outsiders and insiders. In a small region near the origin, outsiders are helped but insiders are harmed; however, when $\mu_H = \mu_T$, insiders and outsiders have identical preferences, the price function is the same regardless of whether insider trading is permitted or banned, and so regulation neither helps nor harms anyone. In two remaining small regions (the symmetrically arranged slivers), the last sentence of Theorem 2 comes into play: the unique candidate price function $\psi(\cdot)$ is nonmonotone, and therefore, no actual competitive equilibrium exists.

FIGURE 3. EQUILIBRIUM PRICE FUNCTION WHEN INSIDER TRADING IS PERMITTED (PLOTTED FOR $\omega = 1/10$, $\rho = 5/4$, $h = 1/2$, $\mu_H = 2/5$, AND $\mu_T = 5/2$). Vertical axis: price.

FIGURE 4. EQUILIBRIUM PRICE FUNCTION WHEN INSIDER TRADING IS BANNED (PLOTTED FOR $\omega = 1/10$, $\rho = 5/4$, $h = 1/2$, $\mu_H = 2/5$, AND $\mu_T = 5/2$). Vertical axis: price.

FIGURE 5. WELFARE CONSEQUENCES OF INSIDER TRADING FOR A GRID OF EXPONENTS ($\omega = 1/10$, $\rho = 5/4$, AND $h = 1/2$). Legend categories: a Pareto improvement is effected when insider trading is banned; outsiders are helped but insiders are harmed when insider trading is banned; no competitive equilibrium exists for these parameter values.

FIGURE 6. WELFARE CONSEQUENCES OF INSIDER TRADING FOR A RANGE OF PROBABILITIES

Additional insight is obtained by unilaterally varying $h$ in the interval $0 < h < 1$. One would expect that the economic consequences of changing insider trading regulations would be weakest when the informational asymmetry between insiders and outsiders is least (i.e., when $h$ is near 0 or 1) and would be strongest when the informational asymmetry between insiders and outsiders is greatest (i.e., when $h$ assumes intermediate values). This prediction is borne out in Figure 6, where I have perturbed the parameter $h$ and calculated the percentage gain in ex ante expected utility for each class of agent. Insiders achieve the greatest percentage gain when $h = 0.285$, while outsiders achieve the greatest percentage gain when $h = 0.44$. When $h$ is near 0 or 1, welfare effects are fairly negligible. A Pareto improvement is effected when $h < 0.74$; for $h > 0.74$, insiders would do better in the equilibrium in which insider trading was permitted, if one in fact existed. However, for $h > 0.68$, monotonicity of the candidate price function fails, and therefore, no competitive equilibrium exists.

If I had wished to write this article as a polemical piece, I might have emphasized another example, with $\mu_H = 1/3$, $\mu_T = 3$, $h = 1/4$, $\omega = 1/2$, and $\rho = 1.01$. With such parameter values, the effect of banning insider trading is to increase insiders’ and outsiders’ levels of investment each by a factor of five. Utilities of both types of agents are also increased by a factor of five.

VII. Conclusion


This article has attempted to contribute 
to the economic analysis of insider trading 
by formalizing “confidence” as an efficiency 
argument. If outsiders expect that insiders 
will take advantage of them at later stages, 
then outsiders may choose to invest less at 
the beginning. Meanwhile, effective regula- 
tion of insider trading at later stages may 
improve the anticipated return on invest- 
ment of outsiders and, hence, promote in- 
vestment by outsiders at the beginning. If 
insiders are helped by the availability of 
outside investment, insiders too may benefit 
from the precommitment created by insider 
trading regulation. It is noteworthy that the 
efficiency considerations posed by “confi- 
dence” point in exactly the same direction 
as the traditional fairness considerations, 
and for almost the same reason. 

The confidence argument seems most 
likely to be decisive in scenarios where the 
early revelation of information affords little 
scope for any allocative improvement and 
where, in particular, the pertinent private 
information will become public regardless 
of whether trading occurs. For example, if 
investment bankers were permitted to trade 
(for their own personal benefit) shortly in 
advance of the announcement of tender of- 
fers, there would probably be little effect on 
the timing or success of tender offers. The 
dominant effect, ex post, would then be re- 
distributive from outsiders to insiders; we 
should principally ask whether the anticipa- 
tion of this insider trading, ex ante, has 
adverse consequences. Thus, despite the fact 
that my model literally made use of a commodity
market rather than a stock exchange,
it may still be a reasonable abstraction
for assessing the desirability of insider
trading in advance of tender offers.28

On the other hand, my model would need to be enlarged in order to depict adequately a situation in which the revelation of private information through trading could play a positive role in determining whether potential investment projects receive funding. Similarly, I would need to introduce additional features in order to examine a situation in which employees of an organization could be given incentives to behave as entrepreneurs.

28 Hence, this article should be viewed as lending support to Rule 14e-3 of the SEC, which in a very expansive way regulates trading in advance of tender offers (see also Section I).

To conclude, the simple structure of the 
model, which made the confidence argu- 
ment relatively transparent, had the cost of 
allowing little other than the confidence ef- 
fect to occur. A valuable next step in assess- 
ing the desirability of insider trading would 
be the construction of richer models that 
have room for both the negative incentives 
posed by confidence and the positive incen- 
tives described by previous authors. Such 
analysis may eventually give us a better un- 
derstanding of the specific types of contexts 
in which insider trading should be permitted
or banned.


APPENDIX 


Derivation of the Conditional Probability +(B) 
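Used by Outsiders. Suppose the representative outsider knew that $\phi_0 < p(\tilde{\beta}, \tilde{\gamma}) < \phi_1$, where $\phi_0 = p(\beta_0, H)$ and $\phi_1 = p(\beta_1, H)$. He could infer that either $\tilde{\gamma} = H$, in which case $\beta_0 < \tilde{\beta} < \beta_1$, or else $\tilde{\gamma} = T$, in which case $\alpha(\beta_0) < \tilde{\beta} < \alpha(\beta_1)$. Since $\tilde{\beta}$ and $\tilde{\gamma}$ are independent, observe that the unconditional probability of the first event ($\Omega_H$ in Fig. 2) is $h[\beta_1 - \beta_0]$ and the unconditional probability of the second event ($\Omega_T$ in Fig. 2) is $[1-h][\alpha(\beta_1) - \alpha(\beta_0)]$. Hence, the conditional probability of the first event is given by

$$\Pr\left[\tilde{\gamma} = H \mid \phi_0 < p(\tilde{\beta}, \tilde{\gamma}) < \phi_1\right] = \frac{h[\beta_1 - \beta_0]}{h[\beta_1 - \beta_0] + [1-h][\alpha(\beta_1) - \alpha(\beta_0)]}. \tag{A1}$$

The probability that $(\tilde{\beta}, \tilde{\gamma}) = (\beta, H)$, conditional on $p(\tilde{\beta}, \tilde{\gamma}) = p(\beta, H)$, is calculated by setting $\beta_0 = \beta$ and taking the limit as $\beta_1 \to \beta_0$ of the expression in (A1). Using l’Hôpital’s rule, this yields equation (4).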
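
The limiting step can be checked numerically. The sketch below evaluates the ratio in (A1) for a shrinking interval and compares it with the closed form $h/(h + [1-h]\,\alpha'(\beta))$ that l'Hôpital's rule gives when $\alpha(\cdot)$ is differentiable. Equation (4) is not reproduced in this section, so the closed form here is derived from (A1) directly, and the map `alpha` is a hypothetical increasing function chosen only for illustration.

```python
def conditional_prob_H(beta0: float, beta1: float, h: float, alpha) -> float:
    """Expression (A1): probability of state H given the price interval."""
    mass_H = h * (beta1 - beta0)
    mass_T = (1.0 - h) * (alpha(beta1) - alpha(beta0))
    return mass_H / (mass_H + mass_T)


def limit_prob_H(beta: float, h: float, alpha_prime) -> float:
    """Limit of (A1) as beta1 -> beta0 = beta, by l'Hopital's rule."""
    return h / (h + (1.0 - h) * alpha_prime(beta))


def alpha(b: float) -> float:
    """Hypothetical increasing map, not the model's alpha."""
    return b ** 2


def alpha_prime(b: float) -> float:
    """Derivative of the hypothetical map above."""
    return 2.0 * b


h, beta = 0.5, 0.3
for step in (1e-1, 1e-3, 1e-5):
    print(step, conditional_prob_H(beta, beta + step, h, alpha))
print("limit", limit_prob_H(beta, h, alpha_prime))
```

As the interval shrinks, the finite-interval probability approaches the limit, which depends on the local slope of $\alpha(\cdot)$: the steeper the map at $\beta$, the more weight the outsider places on state T.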


Derivation of the Agents’ Ex Ante Expected Utility Functions. Let $V_1(x; \beta, \gamma)$ [$V_2(y; \beta, \gamma)$] denote the ex post utility attained by a representative insider (outsider) who has carried forward $x$ ($y$) units of endowment, when the actual state in period two is $(\beta, \gamma)$. Insiders and outsiders apply demands from equations (3) and (6), respectively, giving attained utilities of

$$V_1(x; \beta, \gamma) = x\left\{U_1\left[x_1(p(\beta,\gamma), 1, 0; \beta, \gamma),\; y_1(p(\beta,\gamma), 1, 0; \beta, \gamma);\; \beta, \gamma\right]\right\} \tag{A2}$$

and

$$V_2(y; \beta, \gamma) = y\left\{U_2\left[x_2(p(\beta,\gamma), 0, 1; \beta, \gamma),\; y_2(p(\beta,\gamma), 0, 1; \beta, \gamma);\; \beta, \gamma\right]\right\}. \tag{A3}$$

In the above equations, the terms $x$ and $y$ factor out in a linear fashion because of the homogeneity of degree one of the functions $U_i(\cdot)$, $x_i(\cdot)$, and $y_i(\cdot)$. The ex ante expected utilities, $V_1(x)$ and $V_2(y)$, are calculated by merely integrating the ex post utilities over all possible states, using the correct unconditional probabilities. Using (A2) and (A3), one obtains

$$V_1(x) = x V_1(1) = x\left\{h \int_0^1 V_1(1; \beta, H)\, d\beta + (1-h) \int_0^1 V_1(1; \beta, T)\, d\beta\right\} \tag{A4}$$

and

$$V_2(y) = y V_2(1) = y\left\{h \int_0^1 V_2(1; \beta, H)\, d\beta + (1-h) \int_0^1 V_2(1; \beta, T)\, d\beta\right\}. \tag{A5}$$

PROOF OF THEOREM 2:


The rational expectations equilibrium price function, as indicated in equations (8) and (9), is homogeneous of degree zero in aggregate endowment $(\bar{x}_1, \bar{y}_2)$; that is, if aggregate endowments equaled $(\bar{x}_1', \bar{y}_2')$ such that $\bar{y}_2'/\bar{x}_1' = \bar{y}_2/\bar{x}_1$, precisely the same price function would result. This follows from the fact that the utility functions [of (1) and (2)] imply demands [given by (3) and (6)] which themselves are homogeneous of degree one in endowments. Similarly, observe that $X_1(\cdot, \cdot)$ and $Y_2(\cdot, \cdot)$, defined in (11) and (12), are homogeneous functions of degree zero.

Since the important features of the model depend only on the ratio of aggregate endowments, define $\bar{z} = \bar{y}_2/\bar{x}_1$ to be the ratio of endowments. Also, define the mapping $T(\bar{z})$ to yield the optimal ratio of endowments for insiders and outsiders to produce individually in the first period if they expect the ratio of aggregate endowments in the second period to equal $\bar{z}$. Utilizing the homogeneity of degree zero, one sees that $T(\cdot)$ is given by

$$T(\bar{z}) = Y_2(1, \bar{z})/X_1(1, \bar{z}). \tag{A6}$$

I will now establish the existence and uniqueness of a fixed point of $T(\cdot)$.

Consider the trading stage as $\bar{z} \to \infty$. Note, using (8) and (9), that the price function $p(\cdot, \cdot)$ of good $x$ converges to one, pointwise. Therefore, a representative insider with endowment $x_1 = 1$ can afford to purchase an arbitrarily large quantity of good $y$ while still consuming $x_1 = 1/2$, implying $V_1(1) \to \infty$. Meanwhile, a representative outsider with endowment $y_2 = 1$ can barely afford to purchase any quantity of good $x$, implying $V_2(1) \to 0$. Extracting the first-order conditions from equations (11) and (12) and substituting from equations (A4) and (A5) yields

$$X_1(1, \bar{z}) = \left\{\frac{V_1(1)}{\rho_1 \omega_1}\right\}^{1/(\rho_1 - 1)}, \qquad Y_2(1, \bar{z}) = \left\{\frac{V_2(1)}{\rho_2 \omega_2}\right\}^{1/(\rho_2 - 1)} \tag{A7}$$

and leads to the conclusion that $X_1(1, \bar{z}) \to \infty$ and $Y_2(1, \bar{z}) \to 0$. The definition of $T(\cdot)$ in (A6) thus implies $\lim_{\bar{z} \to \infty} T(\bar{z}) = 0$.
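
The first-order conditions in (A7) have the form that arises when the disutility of labor is a power function. The sketch below assumes $L_i(x) = \omega_i x^{\rho_i}$, a functional form that is not restated in this section and is therefore an assumption here, so that an agent choosing $x$ to maximize $x V_i(1) - \omega_i x^{\rho_i}$ satisfies $V_i(1) = \rho_i \omega_i x^{\rho_i - 1}$. The function names are illustrative. Inverting the condition at the investments reported for the simulation recovers the implied expected utilities net of labor.

```python
def optimal_investment(v_unit: float, omega: float, rho: float) -> float:
    """Maximizer of x * v_unit - omega * x**rho for rho > 1, as in (A7)."""
    return (v_unit / (rho * omega)) ** (1.0 / (rho - 1.0))


def implied_unit_utility(x: float, omega: float, rho: float) -> float:
    """Invert the first-order condition: v_unit = rho * omega * x**(rho - 1)."""
    return rho * omega * x ** (rho - 1.0)


def net_utility(x: float, omega: float, rho: float) -> float:
    """Utility net of labor at the optimum; equals (rho - 1) * omega * x**rho."""
    return x * implied_unit_utility(x, omega, rho) - omega * x ** rho


omega, rho = 0.1, 1.25  # simulation values: omega = 1/10, rho = 5/4
cases = [
    ("insiders, trading permitted", 289.339),
    ("outsiders, trading permitted", 255.315),
    ("insiders, trading banned", 295.195),
    ("outsiders, trading banned", 279.833),
]
for label, x in cases:
    print(f"{label}: {net_utility(x, omega, rho):.3f}")
```

Under this assumed form, the permitted-regime values come out at roughly 29.83 and 25.51 and the banned-regime values at roughly 30.59 and 28.61, which is consistent with the gains of about 2.5 and 12.1 percent reported in the text.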

Consider the trading stage as $\bar{z} \to 0$. Analogous reasoning yields that $V_1(1) \to 0$ and, hence, $X_1(1, \bar{z}) \to 0$; similarly, $V_2(1) \to \infty$ and, hence, $Y_2(1, \bar{z}) \to \infty$. Thus, $\lim_{\bar{z} \to 0} T(\bar{z}) = \infty$. This demonstrates that there exist $z^1$ and $z^2$, where $0 < z^1 < z^2$, such that $T(z^1) > z^1$ and $T(z^2) < z^2$. Since the mapping $T(\cdot)$ is continuous, the intermediate-value theorem guarantees the existence of a fixed point $\bar{z}$ between $z^1$ and $z^2$.

The uniqueness of a fixed point is established by demonstrating that $T(\cdot)$ is monotone decreasing. Consider any two ratios of endowment, $\bar{z}$ and $\bar{z}'$, where $\bar{z}' > \bar{z} > 0$. Let $p(\cdot, \cdot)$ and $p'(\cdot, \cdot)$ be the price functions implied by $\bar{z}$ and $\bar{z}'$, respectively, using equations (8) and (9). Note, for any given state $(\beta, \gamma)$, where $0 < \beta < 1$, that $p'(\beta, \gamma) > p(\beta, \gamma)$. Now observe that when the price is $p'(\beta, \gamma)$, the representative outsider’s demands, $x_2(p'(\beta, \gamma), 0, 1; \beta, \gamma)$ and $y_2(p'(\beta, \gamma), 0, 1; \beta, \gamma)$, are also within the outsider’s budget constraint (with some slack in wealth remaining) at the lower price $p(\beta, \gamma)$. Hence, $V_2(1; \beta, \gamma)$, defined in equation (A3), is strictly greater at price $p(\beta, \gamma)$ than at price $p'(\beta, \gamma)$. Since this inequality holds whenever $0 < \beta < 1$, the ex ante expected utility $V_2(1)$, defined in equation (A5), is also strictly greater under the price function $p(\cdot, \cdot)$ than under $p'(\cdot, \cdot)$. Using (A7), this establishes that $Y_2(1, \bar{z}) > Y_2(1, \bar{z}')$. Analogous reasoning for the representative insider establishes that $X_1(1, \bar{z}') > X_1(1, \bar{z})$. Using (A6), one concludes that $T(\bar{z}') < T(\bar{z})$. Now suppose there existed two fixed points $\bar{z}$ and $\bar{z}'$, where $\bar{z}' > \bar{z}$. This would imply $T(\bar{z}') - T(\bar{z}) = \bar{z}' - \bar{z} > 0$, contradicting that $T(\cdot)$ is monotone decreasing. Therefore, $T(\cdot)$ has a unique fixed point $\bar{z}$.
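
The existence and uniqueness argument also suggests how the fixed point can be computed. Because $T(\cdot)$ is continuous and decreasing, $T(z) - z$ changes sign exactly once, so a bracketing search converges to the unique fixed point. The sketch below uses bisection on a logarithmic scale; the map `T_example` is a hypothetical decreasing function standing in for the model's $T(\cdot)$, which would require the full equilibrium computation.

```python
def fixed_point_decreasing(T, lo=1e-9, hi=1e9, tol=1e-12, max_iter=200):
    """Find z with T(z) = z for a continuous, strictly decreasing T on (0, inf).

    Requires a bracket with T(lo) > lo and T(hi) < hi, mirroring the
    points z^1 and z^2 in the existence argument.
    """
    if not (T(lo) > lo and T(hi) < hi):
        raise ValueError("bracket must satisfy T(lo) > lo and T(hi) < hi")
    for _ in range(max_iter):
        mid = (lo * hi) ** 0.5  # geometric midpoint, since z is a ratio
        if T(mid) > mid:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * hi:
            break
    return (lo * hi) ** 0.5


def T_example(z: float) -> float:
    """Hypothetical decreasing map, not the model's T."""
    return 0.9 / z ** 0.5


z_star = fixed_point_decreasing(T_example)
print(z_star, T_example(z_star))
```

For the stand-in map the fixed point solves $z^{3/2} = 0.9$. In the model, each evaluation of $T(\cdot)$ would involve computing the price function and the expected utilities for the given ratio.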

Finally, observe that there is a one-to-one correspondence between candidate competitive equilibria of the two-stage game and fixed points of $T(\cdot)$. The aggregate endowments $(\bar{x}_1, \bar{y}_2)$ associated with a candidate equilibrium imply a fixed point by $\bar{z} = \bar{y}_2/\bar{x}_1$; a fixed point $\bar{z}$ yields aggregate endowments of a candidate equilibrium by $\bar{x}_1 = X_1(1, \bar{z})$ and $\bar{y}_2 = Y_2(1, \bar{z})$. This leads to the conclusion that there exists a unique candidate competitive equilibrium, which is also an actual competitive equilibrium if and only if $\psi(\cdot)$ is monotone.


REFERENCES 


Allen, Beth, “Generic Existence of Com- 
pletely Revealing Equilibria for 
Economies with Uncertainty when Prices 
Convey Information,” Econometrica, 
September 1981, 49, 1173-99. 

Ausubel, Lawrence M., “Partially-Revealing 
Rational Expectations Equilibrium in a 
Competitive Economy,” Journal of Economic
Theory, February 1990, 50, 93-126.

Bagwell, Laurie S., “Share Repurchase and 
Takeover Deterrence,” Northwestern 
University, Department of Finance, 
Working Paper No. 53, July 1988. 

Bagwell, Laurie S., “Dutch Auction Repurchases: An
Analysis of Shareholder Heterogeneity,”
Northwestern University, mimeo, August
1989.

Bagwell, Laurie S. and Judd, Kenneth L., “Transaction
Costs and Corporate Control,”
Northwestern University, mimeo, November
1988.

Balcer, Yves and Judd, Kenneth L., “Effects of 
Capital Gains Taxation on Life-Cycle In- 
vestment and Portfolio Management,” 
Journal of Finance, July 1987, 42, 743-61. 

Benabou, Roland and Laroque, Guy, “Using 
Privileged Information to Manipulate 
Markets: Insiders, Gurus, and Credibil- 
ity,” Massachusetts Institute of Technol- 
ogy, mimeo, February 1989. 

Bhattacharya, Uptal and Spiegel, Matthew, “In- 
siders, Outsiders and Market Break- 
downs,” Columbia University, mimeo, 
October 1989. 

Brudney, Victor, “Insiders, Outsiders, and In- 
formational Advantages Under the Fed- 
eral Securities Laws,” Harvard Law Re- 
view, December 1979, 93, 322-76. 

Carlton, Dennis W. and Fischel, Daniel R., “The 
Regulation of Insider Trading,” Stanford 
Law Review, May 1983, 35, 857-95. 

Dennert, Jurgen, “Insider Trading and the 
Allocation of Risks,” Universitat Basel, 
mimeo, October 1989. 

Diamond, Douglas W. and Verrecchia, Robert E.,
“Information Aggregation in a Noisy Rational
Expectations Economy,” Journal of
Financial Economics, September 1981, 9,
221-35.

Dye, Ronald A., “Inside Trading and Incen- 
tives,” Journal of Business, July 1984, 57, 
295-313. 

Easterbrook, Frank H., “Insider Trading, Se- 
cret Agents, Evidentiary Privileges, and 
the Production of Information,” Supreme 
Court Review, 1981, 11, 309-65. 

Fishman, Michael J. and Hagerty, Kathleen M., 
“Insider Trading and the Efficiency of 
Stock Prices,” Northwestern University, 
Department of Finance, mimeo, April 
1989. 

Gale, Douglas and Hellwig, Martin F., “In- 
formed Speculation in Large Markets,” 
University of Pittsburgh, mimeo, August 
1987.

Grossman, Sanford J., “The Informational 
Role of Warranties and Private Disclo- 
sure about Product Quality,” Journal of 
Law and Economics, December 1981, 24, 
461-83. 

Grossman, Sanford J. and Stiglitz, Joseph E., “On the
Impossibility of Informationally Efficient
Markets,” American Economic Review,
June 1980, 70, 393-408.

Haddock, David D. and Macey, Jonathan R., 
“Regulation on Demand: A Private Inter- 
est Model, with an Application to Insider 
Trading Regulation,” Journal of Law and 
Economics, October 1987, 30, 311-52. 

Jaffe, Jeffrey F., “Special Information and 
Insider Trading,” Journal of Business, July 
1974, 47, 410-28. 

Jordan, James S., “On the Efficient Markets 
Hypothesis,” Econometrica, September 
1983, 51, 1325-44.

Kihlstrom, Richard E. and Postlewaite, Andrew, 
“Equilibrium in a Securities Market with 
a Dominant Trader Possessing Inside In- 
formation,” University of Pennsylvania, 
mimeo, June 1983. 




Kyle, Albert S., “Continuous Auctions and 
Insider Trading,” Econometrica, Novem- 
ber 1985, 53, 1315-35. 

Laffont, Jean-Jacques and Maskin, Eric S., “The 
Efficient Market Hypothesis and Insider 
Trading on the Stock Market,” Journal of 
Political Economy, February 1990, 98, 
70-93. 

Langevoort, Donald C., Insider Trading Regu- 
lation, New York: Clark Boardman, 1990. 

Lorie, James H. and Niederhoffer, Victor,
“Predictive and Statistical Properties of Insider
Trading,” Journal of Law and Economics,
April 1968, 11, 35-53.

Manne, Henry G., (1966a) Insider Trading and 
the Stock Market, New York: Free Press, 
1966. 

Manne, Henry G., (1966b) “In Defense of Insider
Trading,” Harvard Business Review,
November-December 1966, 44, 113-22.

Manove, Michael, “The Harm from Insider 
Trading and Informed Speculation,” 
Quarterly Journal of Economics, Novem- 
ber 1989, 104, 823-45. 

Posner, Richard A., Economic Analysis of Law, 
Boston: Little, Brown & Co., 1986. 

Radner, Roy, “Rational Expectations Equi- 
librium: Generic Existence and the Infor- 
mation Revealed by Prices,” Economet- 
rica, May 1979, 47, 655-78. 

Scott, Kenneth E., “Insider Trading: Rule 
10b-5, Disclosure, and Corporate 
Privacy,” Journal of Legal Studies, De- 
cember 1980, 9, 801-18. 

Seligman, Joel, “The Reformulation of Fed- 
eral Securities Law Concerning Nonpub- 
lic Information,” Georgetown Law Jour- 
nal, April 1985, 73, 1083-1140. 

Seyhun, H. Nejat, “Insiders’ Profits, Costs of
Trading, and Market Efficiency,” Journal
of Financial Economics, June 1986, 16,
189-212.

Shleifer, Andrei, “Do Demand Curves for
Stocks Slope Down?” Journal of Finance,
July 1986, 41, 579-90.


Optimal Bypass and Cream Skimming 


By JEAN-JACQUES LAFFONT AND JEAN TIROLE*


This paper develops a normative model of regulatory policy toward bypass and 
cream skimming. It analyzes the effects of bypass on second-degree price discrimi- 
nation, on the rent of the regulated firm, and on the welfare of low-demand 
customers. It shows that pricing under marginal cost may be optimal for the 
regulated firm, excessive cream skimming occurs if access to the bypass technol- 
ogy is not regulated, and the prohibition of bypass may increase or decrease the 


regulated firm’s rent. (JEL 026, 613) 


The regulation of “natural monopolies” 
is often associated with policies toward 
competition, including restrictions on entry. 
This paper is concerned with a common 
form of competition, which threatens the 
regulated firm on its most lucrative markets. 
Examples abound: in the telecommunica- 
tions industry, the development of mi- 
crowave radio and communication satellites 
in the 1960’s introduced the possibility that 
big telecommunication customers might by- 
pass the major common carrier (AT&T) and 
deal directly with a satellite company, for
instance. More recently, some large firms
have bypassed the local telephone networks 
and have acquired direct links to long-dis- 
tance carriers. Similar issues arise in the 
energy sector. Big industrial consumers of 
electricity may generate their own power; 
and the 1978 Natural Gas Policy Act in the 
United States has created the possibility for 
industrial plants to bypass the local distribu- 
tion utilities by building direct connections 
to the pipelines, gas producers, or interme- 
diaries. 

What distinguishes these examples from other situations in which a regulated firm faces competition is that the competitive pressure focuses on the high-demand customers (the “cream”) and not on low-demand ones (the “skimmed milk”). That is, entry interferes with second-degree price discrimination by the regulated monopolist.1

The authors are grateful to the Ford Foundation, the Pew Charitable Trust, the Guggenheim Foundation, the Center for Energy Policy Research at MIT, the National Science Foundation, and the French Ministère

As one would expect, cream-skimming has been the object of much regulatory attention. Regulated monopolies have repeatedly called for entry restrictions. For instance, AT&T has assailed MCI as a cream skimmer, lapping up the profits on favorable routes and eschewing high-cost low-return service, and has accused Comsat of syphoning the most profitable part of the business. More recently, local distribution companies have made similar charges against bypass. In both cases, the regulated monopoly has argued that, because of economies of scale, bypass would raise the rates of small-volume commercial and residential users or would reduce the quality of their service. Historically, the case for restriction of entry into the market of a natural monopoly has often been supported on such cream-skimming grounds. Conversely, some have held the view that bypass is the outcome of a healthy competition and can only result in efficiency gains when it occurs. For instance, the Federal Energy Regulatory Commission has argued against the denial of certificates to competitors on the basis that local distribution companies are in a position to compete aggressively.2

1 Cream skimming is also often discussed in the context of a multiproduct firm when some of the regulated firm’s most valuable products are skimmed off by competitors. The case of third-degree price discrimination is simpler to analyze than that of second-degree discrimination, because different customers are offered different terms and therefore there are no incentive constraints of consumers to satisfy; see our previous paper (Laffont and Tirole, 1990b) for an analysis of some issues concerning the interaction of a multiproduct firm with its competitors.

This paper develops a normative model 
of regulatory policy toward bypass and 
cream skimming. It posits a double asymme- 
try of information. First, the regulated firm 
is ignorant of the demand characteristics of 
individual customers and must thus practice 
second-degree price discrimination.³ There 
are two types of customers: “high-demand” 
and “low-demand.” The high-demand cus- 
tomers have the opportunity of using an 
alternative, competitively supplied technol- 
ogy. The fixed cost associated with this al- 
ternative technology (e.g., the cost of build- 
ing an interconnection or a generator) makes 
bypass unattractive to low-demand cus- 
tomers. Second, the regulated firm knows 
more about its own technology than the 
regulator. The firm’s rent is affected by the 
policy toward bypass. Although the cost 
function is assumed to be linear in total 
output, it exhibits increasing returns to scale 
as a bigger output makes reductions in 
marginal cost more desirable. The regulator 
chooses the pricing, cost reimbursement, 
and possibly bypass policies to maximize 
social welfare. 

The paper’s technical contribution is 
twofold. First, the theory of nonlinear pric- 
ing has focused on “downward binding” in- 
centive constraints; that is, a monopolist 
must design a pricing scheme that prevents 


²See Alfred Kahn (1971, chapters 1, 4, 6) for a dis- 
cussion of the economic arguments in favor of and 
against cream skimming. 

³Imperfect information about consumers may justify 
cross-subsidization for redistributional purposes (see 
Laffont and Tirole, 1990a). A normative analysis of 
entry in such circumstances could be carried out along 
the lines of this paper. Preventing entry may facilitate 
redistribution in the same way it may facilitate second- 
degree price discrimination here. 

high-demand customers from consuming the 
low-demand customers’ bundle (see Eric 
Maskin and John Riley, 1984; Michael 
Mussa and Sherwin Rosen, 1978). In the 
presence of bypass, the monopolist may have 
to offer advantageous terms to high-demand 
customers in order to retain them. This 
may lead low-demand customers to consume 
the high-demand customers’ bundle, even 
though they would not use the bypass 
technology. The paper studies the effect of 
“upward binding” incentive constraints.⁴ 
Second, the bypass technology introduces 
discontinuities in the control of the regu- 
lated firm. We show how to deal with such 
discontinuities and extend results obtained 
in regulatory models with more conven- 
tional net-demand functions, in particular 
the linearity of optimal cost-reimbursement 
rules (see Laffont and Tirole, 1986). 

The economic contribution is a welfare 
analysis of cream skimming. We ask: a) 
whether asymmetric information between 
the regulator and the regulated firm in- 
creases the amount of bypass; b) how, in a 
situation of asymmetric information, the 
regulator can ask the firm to substantiate its 
claim that bypass should either be prohib- 
ited or prevented through price cuts; c) 
whether a marginal price under marginal 
cost is an appropriate response to the threat 
of bypass; d) whether low-demand cus- 
tomers are hurt by the possibility of bypass; 
e) whether there is socially too much or too 
little bypass; and f) whether the regulated 
firm is necessarily hurt by the possibility of 
bypass. Section I describes the model, Sec- 
tion II derives the optimal regulatory 
scheme, and Section III summarizes our 
main findings.⁵ 


⁴For similar considerations, see Paul Champsaur 
and Jean-Charles Rochet (1989) for a duopoly model 
of competition in quality and prices, as well as Tracy 
Lewis and David Sappington (1989) for countervailing 
incentives in regulation. Incentive constraints may also 
be “binding upwards” (but for another reason) in 
dynamic incentive problems without commitment (see 
Laffont and Tirole, 1987). 

⁵Bernard Caillaud (1985) studies the effect of an 
unregulated competitive fringe on the regulation of a 

I. The Model 


The regulated firm serves two types of 
consumers ($i = 1, 2$), in numbers $\alpha_1$ and $\alpha_2$. 
Let $q_1$ and $q_2$ be the consumptions of type-1 
and type-2 consumers, respectively. Total 
consumption is $Q = \alpha_1 q_1 + \alpha_2 q_2$. Let $S_i(q_i)$ 
be the utility derived by a type-$i$ consumer 
from consuming the regulated firm’s good. 
To facilitate the analysis,⁶ we make the 
assumptions that $S_1(q) = S(q)$ and $S_2(q) = \theta S(q)$ 
with $\theta > 1$. 

The technology of the regulated firm is 
defined by its cost function: 


(1) $C = (\beta - e)Q$ 


where $\beta$ is an intrinsic cost parameter 
known only to the firm, and $e$ is an effort 
level which has disutility $\psi(e)$ (with $\psi' > 0$, 
$\psi'' > 0$, $\psi''' \ge 0$)⁷ for the firm’s manager. We 
could add a (known) fixed cost in (1) with- 
out any change for our analysis. It is com- 
mon knowledge that $\beta \in [\underline{\beta}, \bar{\beta}]$. 

The regulator observes the firm’s outputs 
$q_1$ and $q_2$, cost $C$, and revenue $R(q_1, q_2)$. 
We make the accounting convention that 
the regulator receives $R(q_1, q_2)$, reimburses 
cost $C$, and pays a net transfer $t$ to the firm. 


natural monopoly. As in this paper, the possibility of 
substitution for consumers introduces discontinuities in 
the regulated firm’s control problem. Caillaud focuses 
on the role of correlation between the technologies of 
the regulated firm and the competitive fringe, rather 
than on the issue of second-degree price discrimination 
(in Caillaud’s model, arbitrage constrains firms to prac- 
tice linear pricing). Michael Einhorn (1987) provides 
an analysis of bypass in a model without asymmetric 
information about the firm’s technology. He obtains 
the result that marginal price may be below marginal 
cost for some consumers in the absence of incentive 
constraints for consumers. In our analysis, price may be 
below marginal cost because the regulated firm cannot 
identify high-valuation consumers and use third-degree 
price discrimination. Einhorn (1987) finds that cus- 
tomers make efficient choices when deciding to use the 
bypass technology. We will show, on the contrary, that 
bypass is used too often, compared to a situation in 
which the regulator could monitor the access to bypass. 

⁶This assumption enables us to have a convex pro- 
gram in the no-bypass regimes so that necessary and 
sufficient conditions for characterizing the optimal so- 
lution are available. 

⁷$\psi''' \ge 0$ makes stochastic incentive schemes nonop- 
timal. 

Regulation must satisfy the individual ra- 
tionality (IR) constraint of the firm:⁸ 


(2) $U = t - \psi(e) \ge 0$ for all $\beta \in [\underline{\beta}, \bar{\beta}]$. 

In addition, there exists an alternative 
(bypass) technology to which a consumer 
has access if he pays a fixed cost $f$. Then, 
the constant marginal cost of this alterna- 
tive technology is $d$.⁹,¹⁰ 

We make an assumption that ensures that 
type-1 consumers (the low-valuation con- 
sumers) never find it advantageous to use 
the bypass technology. Let 


$S_1^* \equiv \max_q \{S(q) - f - dq\}$ 


denote the utility level of a type-1 consumer 
using the bypass technology. We postulate 
that the bypass alternative is never an opti- 
mal choice for low-demand consumers: $S_1^* < 0$. 
We assume that $\alpha_1/\alpha_2$ is not too 
small, so that it is always optimal for the 
regulated firm to serve the low-demand cus- 
tomers. However, in some circumstances, 
the type-2 (high-valuation) consumers may 
want to quit the regulated firm and use the 
bypass technology. 

To fit the examples given in the intro- 
duction, we assume that individual con- 
sumption of the regulated product can be 
monitored by the regulated firm so that 
nonlinear pricing is feasible. Let $T_i$ denote 
the payment by type-$i$ consumers for quan- 
tity $q_i$ and let $\{(T_1, q_1), (T_2, q_2)\}$ be a nonlin- 
ear schedule. Then, $R(q_1, q_2) = \alpha_1 T_1 + \alpha_2 T_2$. 

The regulator is utilitarian and wishes to 
maximize the sum of consumers’ welfares 


⁸We assume implicitly that it is never optimal to 
shut down the regulated firm. 

⁹Note that our analysis would be unchanged if the 
regulated firm could also offer the alternative technol- 
ogy and Bertrand competition with zero profits on the 
high-demand consumers took place. 

¹⁰Our approach thus differs from that of the con- 
testability literature (William Baumol et al., 1982) in 
several respects. First, we allow transfers between the 
regulator and the firm. Second, the regulated firm and 
its competitors (here, the bypass producers) do not 
face the same cost functions. Third, the regulated 
firm’s technology is unknown to the regulator. 


VOL. 80 NO. 5 


and the firm’s utility level, taking into ac- 
count that the social cost of public funds is 
$1 + \lambda > 1$ (because of distortive taxation). 
When all consumers consume the regulated 
firm’s good, social welfare is 


(3) $W = \alpha_1 S(q_1) + \alpha_2 \theta S(q_2) - (\alpha_1 T_1 + \alpha_2 T_2) - (1+\lambda)(C + t - \alpha_1 T_1 - \alpha_2 T_2) + U$ 
    $= \alpha_1 S(q_1) + \alpha_2 \theta S(q_2) - (1+\lambda)[(\beta - e)(\alpha_1 q_1 + \alpha_2 q_2) + \psi(e)] - \lambda U + \lambda(\alpha_1 T_1 + \alpha_2 T_2)$. 
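The second equality in (3) uses only the cost function (1) and the definition of the firm's utility in (2). As a worked step, writing $R \equiv \alpha_1 T_1 + \alpha_2 T_2$ for revenue (the text's $R(q_1, q_2)$) and substituting $C = (\beta - e)Q$ and $t = U + \psi(e)$:

```latex
\begin{align*}
W &= \alpha_1 S(q_1) + \alpha_2 \theta S(q_2) - R - (1+\lambda)\bigl(C + t - R\bigr) + U \\
  &= \alpha_1 S(q_1) + \alpha_2 \theta S(q_2) - R
     - (1+\lambda)\bigl[(\beta - e)Q + \psi(e) + U - R\bigr] + U \\
  &= \alpha_1 S(q_1) + \alpha_2 \theta S(q_2)
     - (1+\lambda)\bigl[(\beta - e)Q + \psi(e)\bigr] - \lambda U + \lambda R .
\end{align*}
```

Revenue and rent thus enter welfare only through the shadow cost of public funds: a unit of revenue is worth $\lambda$ and a unit of rent costs $\lambda$.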


When type-2 consumers use the bypass 
technology, their utility level is $S_2^* \equiv \max_q \{\theta S(q) - f - dq\}$, 
and social welfare is 


(4) $W' = \alpha_1 S(q_1) + \alpha_2 S_2^* - (1+\lambda)[(\beta - e)\alpha_1 q_1 + \psi(e)] - \lambda U + \lambda \alpha_1 T_1$. 
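As an illustration of the two bypass utilities, the sketch below evaluates $\max_q \{\theta S(q) - f - dq\}$ for a surplus function chosen purely for convenience, $S(q) = 2\sqrt{q}$. The functional form, the parameter values, and the function names are assumptions of this example and do not come from the paper.

```python
import math


def surplus(q: float) -> float:
    """Assumed gross surplus S(q) = 2*sqrt(q) (illustrative only)."""
    return 2.0 * math.sqrt(q)


def bypass_utility(theta: float, fixed_cost: float, marginal_cost: float) -> float:
    """Value of max_q { theta*S(q) - f - d*q } for S(q) = 2*sqrt(q).

    The first-order condition theta / sqrt(q) = d gives q = (theta / d)**2,
    so the maximized value equals theta**2 / d - f.
    """
    q_star = (theta / marginal_cost) ** 2
    return theta * surplus(q_star) - fixed_cost - marginal_cost * q_star


def bypass_utility_grid(theta, fixed_cost, marginal_cost, q_max=10.0, steps=100_000):
    """Brute-force check of the same maximum on a grid of quantities."""
    best = -math.inf
    for k in range(steps + 1):
        q = q_max * k / steps
        best = max(best, theta * surplus(q) - fixed_cost - marginal_cost * q)
    return best


# Hypothetical parameters: f = 1.5, d = 1.0, theta = 1.5.
s1_star = bypass_utility(theta=1.0, fixed_cost=1.5, marginal_cost=1.0)  # type 1
s2_star = bypass_utility(theta=1.5, fixed_cost=1.5, marginal_cost=1.0)  # type 2
```

With these numbers the low-demand type gets a negative value from bypass while the high-demand type gets a positive one, which is the configuration the model postulates.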


The transfers $T_1$ and $T_2$ will be seen to 
be linear combinations of $S(q_1)$ and $S(q_2)$; 
so if we denote $S(q_1) = s_1$ and $S(q_2) = s_2$, 
the concavity of the objective function $W$ 
relies on the concavity in $(s_1, s_2, e)$ of 
$\Gamma(s_1, s_2, e) \equiv -(1+\lambda)\{(\beta - e)[\alpha_1 \phi(s_1) + \alpha_2 \phi(s_2)] + \psi(e)\}$, 
where $\phi$ is the inverse func- 
tion of $S$. Throughout the paper we assume 
that $\Gamma(\cdot)$ is strictly concave in the relevant 
domain of $(s_1, s_2, e)$.¹¹ This will ensure a 
unique solution for $(q_1, q_2, e)$ in each regime 
considered in the paper. 

The regulator does not observe $e$ and has 
incomplete information about $\beta$. He has a 
prior on $[\underline{\beta}, \bar{\beta}]$ represented by the cumula- 
tive distribution function $F(\cdot)$ which satis- 
fies the monotone hazard-rate property 
$d(F/f)/d\beta \ge 0$.¹² The first step of our 

¹¹If we assume that $\psi'(e) \to \infty$ as $e \to \underline{\beta}$ and $\beta - e > 0$ 
for all $\beta$, concavity is obtained if $|S''|$ is large enough. 

¹²This assumption simplifies the analysis by prevent- 
ing “bunching phenomena” (bunching occurs when the 
regulator induces different types $\beta$ to choose the same 
allocation $\{t, c, q_1, q_2\}$). 

analysis is to characterize the regulator’s 
optimal pricing rule and optimal incentive 
schemes under incomplete information. 


II. Optimal Pricing Rule and Optimal 
Incentive Schemes 


Our first observation is that incentive 
compatibility implies that the marginal cost 
$c(\beta) \equiv \beta - e(\beta)$ is a nondecreasing function of the in- 
trinsic cost parameter $\beta$. Moreover, the par- 
ticular cost function we postulated for the 
regulated firm makes the rent of asymmetric 
information $U(\cdot)$ that must be given up to a 
firm of type $\beta$ a function only of the marginal 
cost schedule $c(\cdot)$ being implemented. 
These facts, which follow from classical ar- 
guments in incentive theory, are proved in 
Appendix 1. The intuition as to why the 
firm’s rent depends only on the marginal 
cost schedule $c(\cdot)$ can be grasped from the 
cost function (1). The regulator observes $C$ 
and $Q$ and, therefore, knows the realized 
marginal cost. Thus, the scope for the firm 
to transfer good technological conditions 
(low $\beta$) into a slack rent (low $e$) is deter- 
mined by the marginal cost schedule. The 
intuition as to why $c$ is nondecreasing in $\beta$ 
is that it is less costly for a low-$\beta$ firm to 
produce at a low cost. Hence, if type $\beta$ 
prefers to produce at cost $c$ rather than cost 
$\hat{c} < c$, type $\beta' > \beta$ cannot prefer to produce 
at cost $\hat{c}$ rather than cost $c$. 

These two facts justify the two-step pro- 
cedure that we use to characterize the opti- 
mal solution. For a given value of $\beta$ and a 
given average (or marginal) cost $c$, social 
welfare is 


(5) $W(c, q_1, q_2, T_1, T_2, \beta) \equiv V(c, q_1, q_2, T_1, T_2) + Z(\beta, c)$ 


where 


$Z = -(1+\lambda)\psi(\beta - c) - \lambda U(\beta)$ 

$V = \alpha_1 S(q_1) + \alpha_2 \theta S(q_2) - (1+\lambda)c(\alpha_1 q_1 + \alpha_2 q_2) + \lambda(\alpha_1 T_1 + \alpha_2 T_2)$ 


when the bypass technology is not used by 
type-2 consumers, and 


$V = \alpha_1 S(q_1) + \alpha_2 S_2^* - (1+\lambda)c\alpha_1 q_1 + \lambda \alpha_1 T_1$ 


when it is. 

The constraints imposed on the regu- 
lator’s maximization program are of two 
kinds: the firm’s incentive constraints 


(6) $\dot{U}(\beta) = -\psi'(\beta - c(\beta))$ and $\dot{c}(\beta) \ge 0$ 


(7) $U(\bar{\beta}) \ge 0$ 


almost everywhere (see Appendix 1) and 
the consumers’ incentive constraints coming 
from the fact that the firm cannot distin- 
guish ex ante between the two types of 
consumers. Type-1 consumers’ individual 
rationality constraint (IR₁) is 


(8) $S(q_1) - T_1 \ge 0$. 


Type-2 consumers must obtain a utility level 
as large as in the bypass alternative to re- 
main with the regulated firm; therefore, 
their individual rationality constraint (IR₂) 
is 


(9) $\theta S(q_2) - T_2 \ge S_2^*$. 


When the type-2 consumers use the bypass 
technology, IR₂ becomes irrelevant. 

Self-selection by consumers also imposes 
the following incentive constraints (IC₁ and 
IC₂, respectively): 


(10) $S(q_1) - T_1 \ge S(q_2) - T_2$ 
(11) $\theta S(q_2) - T_2 \ge \theta S(q_1) - T_1$. 
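To make constraints (8)-(11) concrete, the following sketch computes the slack in each constraint for a given two-point menu and maps the set of binding constraints to a regime number using the patterns listed in Table 1. The surplus function $S(q) = 2\sqrt{q}$, the numerical menu, and all function names are assumptions made for this example; only the no-bypass regimes 1 to 5 are covered.

```python
import math


def surplus(q):
    """Assumed gross surplus S(q) = 2*sqrt(q) (illustrative only)."""
    return 2.0 * math.sqrt(q)


# Binding-constraint patterns of the no-bypass regimes, as listed in Table 1.
REGIMES = {
    frozenset({"IC2", "IR1"}): 1,
    frozenset({"IC2", "IR1", "IR2"}): 2,
    frozenset({"IR1", "IR2"}): 3,
    frozenset({"IR1", "IR2", "IC1"}): 4,
    frozenset({"IR2", "IC1"}): 5,
}


def constraint_slacks(q1, t1, q2, t2, theta, s2_star):
    """Left-hand side minus right-hand side of constraints (8)-(11)."""
    u1_own = surplus(q1) - t1             # type 1 taking its own bundle
    u1_other = surplus(q2) - t2           # type 1 taking type 2's bundle
    u2_own = theta * surplus(q2) - t2     # type 2 taking its own bundle
    u2_other = theta * surplus(q1) - t1   # type 2 taking type 1's bundle
    return {
        "IR1": u1_own,                # (8)
        "IR2": u2_own - s2_star,      # (9)
        "IC1": u1_own - u1_other,     # (10)
        "IC2": u2_own - u2_other,     # (11)
    }


def classify_regime(q1, t1, q2, t2, theta, s2_star, tol=1e-9):
    """Return the regime number, or None if the menu is infeasible or unlisted."""
    slacks = constraint_slacks(q1, t1, q2, t2, theta, s2_star)
    if any(s < -tol for s in slacks.values()):
        return None  # at least one constraint is violated
    binding = frozenset(name for name, s in slacks.items() if abs(s) <= tol)
    return REGIMES.get(binding)


# Hypothetical menu built so that IR1 and IC2 hold with equality.
theta = 1.5
q1, q2 = 1.0, 2.25
t1 = surplus(q1)
t2 = theta * surplus(q2) - (theta - 1.0) * surplus(q1)
```

Raising the bypass utility $S_2^*$ while holding the menu fixed first makes IR₂ bind and then makes the menu infeasible, which is the mechanism that moves the optimum across regimes.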


In view of the decomposition obtained in 
(5), the optimization of expected social wel- 
fare under constraints (6)-(11) can be de- 
composed into 1) a maximization of $V$ sub- 
ject to (8)-(11) with respect to $q_1$, $q_2$, $T_1$, 
and $T_2$ for each value of $c$ (which deter- 
mines optimal pricing) and 2) a maximiza- 
tion of the expected value of social welfare 
with respect to $c(\cdot)$ for those values of $q_1$, 
$q_2$, $T_1$, and $T_2$ under constraints (6) and (7). 


TABLE 1. THE SIX POSSIBLE REGIMES 

Regime   Binding constraintsᵃ      Bypass? 
1        (IC₂), (IR₁)              No 
2        (IC₂), (IR₁), (IR₂)       No 
3        (IR₁), (IR₂)              No 
4        (IR₁), (IR₂), (IC₁)       No 
5        (IR₂), (IC₁)              No 
6        (IR₁)                     Yes 

ᵃAs given in equations (8)-(11). 


Let us now consider the first maximiza- 
tion. Straightforward arguments show that 
constraints (8)-(11) define six possible 
regimes characterized by the binding con- 
straints in (8)-(11) and the occurrence of 
the bypass regime (see Appendix 2 for a 
proof). These regimes are given in Table 1. 

In each of the six regimes, optimal pricing 
is determined by 


$\max \{V(c, q_1, q_2, T_1, T_2)\}$ 


subject to the binding constraints of re- 
gime $i$. 

Let $V^i(c)$ be the optimal value of this 
program in regime $i$ and $q_j^i(c)$ the consump- 
tion of type $j$ in regime $i$, $i = 1, \dots, 6$. The 
results of these maximizations are gathered 
in the next proposition. Let $p_1 = S'(q_1)$ and 
$p_2 = \theta S'(q_2)$ denote the marginal prices for 
the type-1 and type-2 consumers, respec- 
tively. 


PROPOSITION 1: In regimes 1 and 2, $p_1 > c$ 
and $p_2 = c$ with $dp_1/dc > 0$ in regime 1 
and $dp_1/dc = 0$ in regime 2. In regime 3, 
$p_1 = p_2 = c$. In regimes 4 and 5, $p_1 = c$ and 
$p_2 < c$ with $dp_2/dc = 0$ in regime 4 and 
$dp_2/dc > 0$ in regime 5. In regime 6, $p_1 = c$. 


See Appendix 3 for a proof of this proposi- 
tion and for the formulas defining optimal 
prices. In Proposition 2 (below), we show 
that regimes are ordered from 1 through 6 
as $c$ [or equivalently $\beta$ from (6)] increases. 
For a very efficient regulated firm (very 
low $\beta$), the net surplus obtained by high-val- 
uation consumers is strictly higher than what 
they could obtain with the bypass, even when 
the firm uses optimal second-degree price 
discrimination. The classical “no distortion 
at the top” result amounts to the equality of 
marginal price and marginal cost for the 
high-valuation consumers (regime 1). For 
low-valuation consumers the marginal price 
exceeds marginal cost to prevent high-val- 
uation consumers from buying the low-val- 
uation buyers’ bundle (because the shadow 
cost of public funds is positive, the regu- 
lated firm behaves qualitatively like a 
monopolist and thus tries to limit the rent 
enjoyed by high-valuation consumers). As 
the efficiency of the regulated firm de- 
creases, the bypass constraint becomes 
binding. The allocation is then distorted in 
several steps. The payment $T_2$ that can be 
obtained from the high-valuation consumers 
must be limited, which relaxes IC₂ and thus 
allows the regulator to bring the low- 
valuation consumers’ marginal price closer 
to marginal cost (regime 2). As $\beta$ still in- 
creases, IC₂ becomes nonbinding, and the 
regulator equates marginal cost and 
marginal price for both types (regime 3); but 
as $\beta$ still increases, the limit put on the 
payment made by type-2 consumers makes 
type-1 consumers’ incentive constraint (IC₁) 
binding. They wish to take the contract of- 
fered to type 2. To prevent that, consump- 
tion of type-2 consumers is increased be- 
yond the first best level by lowering the 
marginal price below marginal cost; this re- 
laxes IC₁, because type-2 consumers have a 
higher marginal utility for the good (regime 
4).¹³ Finally, the payment made by the type-1 
consumers is lowered to satisfy their incen- 
tive constraint, which leaves them with a 
surplus. When this regime is obtained, we 
have the interesting result that, because of 
(unmonitored) bypass, optimal regulation 
may require leaving a rent to low-valuation 

¹³For a given marginal cost $c$, the consumptions of 
low-demand customers in regime 2 ($q_1^2$) and high- 
demand customers in regime 4 ($q_2^4$) are equal. This is 
because $\theta S(q_1^2) - T_1 = S_2^*$ (from IC₂ and 
IR₂) and $T_1 = S(q_1^2)$ (from IR₁) on the one hand, and 
$\theta S(q_2^4) - T_2 = S_2^*$ (from IR₂) and $T_2 = S(q_2^4)$ (from IC₁ 
and IR₁) on the other hand, yield the same solution: 
$q_1^2 = q_2^4 = S^{-1}(S_2^*/(\theta - 1))$. (This is not an artifact of 
the particular surplus functions we chose.) 

consumers to be able to offer to high-val- 
uation consumers a deal good enough that 
they do not use the bypass (regime 5). Last, 
in the bypass regime (regime 6), the firm 
serves a single category of consumers and 
thus imposes no distortion in consumption. 
Note that, while the optimal regime de- 
pends on $\beta$, all the conclusions about pric- 
ing in a given regime obtained in Proposi- 
tion 1 hold for any $\beta$ (i.e., are due only to 
the asymmetry of information with respect 
to consumers’ tastes). 
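The regime-1 part of Proposition 1 can be illustrated numerically. The sketch below is a derivation under stated assumptions, not a transcription of Appendix 3 (which is not reproduced here): it substitutes the binding constraints IR₁ and IC₂, that is $T_1 = S(q_1)$ and $T_2 = \theta S(q_2) - (\theta - 1)S(q_1)$, into $V$ and solves the two first-order conditions. The surplus function $S(q) = 2\sqrt{q}$, the parameter values, and the function names are hypothetical.

```python
import math


def regime1_prices(c, lam, alpha1, alpha2, theta):
    """Marginal prices (p1, p2) when only IR1 and IC2 bind.

    After substitution, V = k1*S(q1) + (1+lam)*alpha2*theta*S(q2)
                            - (1+lam)*c*(alpha1*q1 + alpha2*q2),
    with k1 = (1+lam)*alpha1 - lam*alpha2*(theta-1). The first-order
    conditions are theta*S'(q2) = c and k1*S'(q1) = (1+lam)*alpha1*c.
    """
    k1 = (1.0 + lam) * alpha1 - lam * alpha2 * (theta - 1.0)
    if k1 <= 0:
        raise ValueError("low-demand customers would not be served")
    p1 = (1.0 + lam) * alpha1 * c / k1
    p2 = c
    return p1, p2


def reduced_welfare(q1, q2, c, lam, alpha1, alpha2, theta):
    """V evaluated with IR1 and IC2 binding, for S(q) = 2*sqrt(q)."""
    s1, s2 = 2.0 * math.sqrt(q1), 2.0 * math.sqrt(q2)
    t1 = s1
    t2 = theta * s2 - (theta - 1.0) * s1
    return (alpha1 * s1 + alpha2 * theta * s2
            - (1.0 + lam) * c * (alpha1 * q1 + alpha2 * q2)
            + lam * (alpha1 * t1 + alpha2 * t2))


# Hypothetical parameters.
params = dict(c=1.0, lam=0.3, alpha1=1.0, alpha2=1.0, theta=1.5)
p1, p2 = regime1_prices(**params)
# With S'(q) = 1/sqrt(q): p1 = S'(q1) and p2 = theta*S'(q2).
q1_opt = 1.0 / p1 ** 2
q2_opt = (params["theta"] / p2) ** 2
```

In this example the high-demand type pays a marginal price equal to marginal cost while the low-demand type pays more, and the markup on type 1 grows with $\lambda$ and with $\alpha_2/\alpha_1$, as one would expect if it serves to limit the rent of high-valuation consumers.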

We next observe that the bypass regime 
can only occur for an interval $[\beta^*, \bar{\beta}]$ of 
values of $\beta$,¹⁴ and more generally the 
regimes (when they exist) are ordered from 
1 to 6 when $\beta$ increases. 


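PROPOSITION 2: (i) There exists $\beta^* \in [\underline{\beta}, \bar{\beta}]$ 
such that the bypass regime occurs if 
and only if $\beta \ge \beta^*$. (ii) There exist 
$\beta_0 = \underline{\beta} \le \beta_1 \le \beta_2 \le \beta_3 \le \beta_4 \le \beta_5 = \beta^*$ such 
that regime $i$ prevails if and only if 
$\beta \in [\beta_{i-1}, \beta_i]$. 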


See Appendix 4 for a proof. 

The intuition for (i) is simply that, if the 
regulated firm becomes more inefficient, 
letting the high-demand customers use the 
bypass becomes more attractive. The idea 
behind the proof of (i) is that, given the 
concavity of our problem, variables that sat- 
isfy the first-order conditions for max- 
imizing expected social welfare and yield 
continuous control variables form a solu- 
tion. Moving from regime 1 to regime 5 
yields a continuous solution on $[\underline{\beta}, \beta^*]$. Fig- 
ures 1 and 2 summarize our findings up to 
now. 

We next look for a solution to the maxi- 
mization over $[\underline{\beta}, \beta^*]$ with the ordering of 
regimes 1, 2, 3, 4, and 5 along the $\beta$-axis. 
Let us call $\hat{V}(c(\beta))$ the function obtain- 
ed by piecing together the functions 
$V^1(c), \dots, V^5(c)$ on the intervals $[\underline{\beta}, \beta_1]$, 
$[\beta_1, \beta_2], \dots, [\beta_4, \beta^*)$, with 
$\underline{\beta} \le \beta_1 \le \beta_2 \le \beta_3 \le \beta_4 \le \beta^* \le \bar{\beta}$ (all regimes need not exist). 
That is, $\hat{V}(c) = \max_{i \in \{1, \dots, 5\}} V^i(c)$. 


¹⁴This observation is of interest only when $\bar{\beta}$ is 
sufficiently large. For small $\bar{\beta}$, $\beta^* = \bar{\beta}$ and bypass never 
occurs (although it may be binding). 

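In view of Proposition 2, the program for 
the overall maximization of expected wel- 
fare can be written: 


(12) $\max \int_{\underline{\beta}}^{\beta^*} [\hat{V}(c(\beta)) - (1+\lambda)\psi(\beta - c(\beta)) - \lambda U(\beta)]\, dF(\beta)$ 
     $+ \int_{\beta^*}^{\bar{\beta}} [V^6(c^6(\beta)) - (1+\lambda)\psi(\beta - c^6(\beta)) - \lambda U(\beta)]\, dF(\beta)$ 

subject to 

$\dot{U}(\beta) = -\psi'(\beta - c(\beta))$ 
$U(\bar{\beta}) \ge 0$ 
$\dot{c}(\beta) \ge 0$ 

(this last constraint will be ignored in a first 
step and checked ex post). 

Fixing $\beta^*$ and $U(\beta^*) \equiv \bar{U} = \int_{\beta^*}^{\bar{\beta}} \psi'(\beta - c^6(\beta))\, d\beta$, 
we first look for a solution to the 
maximization over $[\underline{\beta}, \beta^*]$ using the fact that 
the ordering of regimes is 1, 2, 3, 4, 5 along 
the $\beta$-axis. We have 


(13) $\max \sum_{i=0}^{4} \int_{\beta_i}^{\beta_{i+1}} V^{i+1}(c^{i+1}(\beta))\, dF(\beta)$ 
     $- \int_{\underline{\beta}}^{\beta^*} [(1+\lambda)\psi(\beta - c(\beta)) + \lambda U(\beta)]\, dF(\beta)$ 


subject to 

$\dot{U}(\beta) = -\psi'(\beta - c(\beta))$ 
$U(\beta^*) = \bar{U}$, 

where $\beta_0 \equiv \underline{\beta}$ and $\beta_5 \equiv \beta^*$. 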


In each regime, using $dV^i/dc = -(1+\lambda)(\alpha_1 q_1^i + \alpha_2 q_2^i)$, 
and letting $q_j^i(c)$ denote the 
optimal consumption of type-$j$ consumers in 
regime $i$ when marginal cost is $c$ (see 
Proposition 1), the Pontryagin principle 
yields 


(14) $\psi'(\beta - c^i(\beta)) = \alpha_1 q_1^i(c^i(\beta)) + \alpha_2 q_2^i(c^i(\beta)) - \frac{\lambda}{1+\lambda} \frac{F(\beta)}{f(\beta)} \psi''(\beta - c^i(\beta))$ 


as in Laffont and Tirole (1986). Equation 
(14) has a unique solution in each regime 
from our concavity assumptions. 
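A minimal numerical sketch of equation (14) may help fix ideas. It assumes a quadratic disutility $\psi(e) = k e^2/2$ (so $\psi' = ke$ and $\psi'' = k$), a uniform prior on $[\underline{\beta}, \bar{\beta}]$ (so $F/f = \beta - \underline{\beta}$), and linear demands generated by $S(q) = q - q^2/2$ with both marginal prices equal to $c$, as in regime 3. None of these functional forms, parameter values, or function names come from the paper.

```python
def total_demand(c, alpha1=1.0, alpha2=1.0, theta=1.5):
    """Q(c) under marginal-cost pricing with assumed S(q) = q - q**2/2.

    Type 1 solves 1 - q = c and type 2 solves theta*(1 - q) = c.
    """
    return alpha1 * (1.0 - c) + alpha2 * (1.0 - c / theta)


def solve_marginal_cost(beta, beta_low, lam, k, demand=total_demand, tol=1e-12):
    """Solve equation (14) for c by bisection on [0, beta].

    Assumes psi(e) = k*e**2/2 and a uniform prior, so F/f = beta - beta_low.
    The residual is decreasing in c when k exceeds the slope of demand,
    which plays the role of the concavity assumption in this example.
    """
    hazard_term = lam / (1.0 + lam) * (beta - beta_low) * k

    def residual(c):
        # psi'(beta - c) - [Q(c) - lam/(1+lam) * (F/f) * psi'']
        return k * (beta - c) - demand(c) + hazard_term

    lo, hi = 0.0, beta
    if residual(lo) < 0 or residual(hi) > 0:
        raise ValueError("no solution with 0 <= c <= beta")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters: beta uniform on [0.5, 0.7], lambda = 0.3, k = 4.
betas = [0.5, 0.55, 0.6, 0.65, 0.7]
costs = [solve_marginal_cost(b, beta_low=0.5, lam=0.3, k=4.0) for b in betas]
efforts = [b - c for b, c in zip(betas, costs)]
```

In this example the implemented marginal cost rises with $\beta$ and effort falls, in line with the monotonicity of $c(\cdot)$ required by incentive compatibility.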

The characterizations of the (possibly de- 
generate) intervals defining regimes are ob- 
tained by maximizing (13) with respect to 
$\beta_1$, $\beta_2$, $\beta_3$, and $\beta_4$: 


(15) $V^i(c^i(\beta_i)) = V^{i+1}(c^{i+1}(\beta_i))$ for $i \in \{1, 2, 3, 4\}$. 


Furthermore, the derivative of the value of 
the program (13) with respect to $\bar{U} = U(\beta^*)$ 
is equal to $-\lambda F(\beta^*)$ (a unit increase in $\bar{U}$ 
translates into a unit increase in the rent of 
all types that are more efficient than $\beta^*$, 
which has social cost $\lambda$). 

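Still fixing $\beta^*$, we next look for a solution 
to 


(16) $\max \int_{\beta^*}^{\bar{\beta}} [V^6(c(\beta)) - (1+\lambda)\psi(\beta - c(\beta)) - \lambda U(\beta)]\, dF(\beta) - \lambda F(\beta^*) U(\beta^*)$ 

with the constraints that 

$\dot{U}(\beta) = -\psi'(\beta - c(\beta))$ 
$U(\bar{\beta}) \ge 0$. 


Since $dV^6/dc = -(1+\lambda)\alpha_1 q_1^6(c)$, the Pon- 
tryagin principle gives 


(17) $\psi'(\beta - c^6(\beta)) = \alpha_1 q_1^6(c^6(\beta)) - \frac{\lambda}{1+\lambda} \frac{F(\beta)}{f(\beta)} \psi''(\beta - c^6(\beta))$ 

with 


$U(\beta^*) = \int_{\beta^*}^{\bar{\beta}} \psi'(\tilde{\beta} - c^6(\tilde{\beta}))\, d\tilde{\beta}$. 


Using the expression for $q_1^6$ given in 
Proposition 1, our concavity assumptions of 
Section I imply that (17) has a unique solu- 
tion $\{q_1^6(\beta), c^6(\beta)\}$. 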

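The rents are obtained by backward in- 
duction from regime 6 to regime 1: 


(18) $U(\beta) = \int_{\beta}^{\bar{\beta}} \psi'(\tilde{\beta} - c^6(\tilde{\beta}))\, d\tilde{\beta}$ for $\beta \in [\beta^*, \bar{\beta}]$ 

     $U(\beta) = \int_{\beta}^{\beta^*} \psi'(\tilde{\beta} - c^5(\tilde{\beta}))\, d\tilde{\beta} + U(\beta^*)$ for $\beta \in [\beta_4, \beta^*]$ 

     $\vdots$ 

     $U(\beta) = \int_{\beta}^{\beta_1} \psi'(\tilde{\beta} - c^1(\tilde{\beta}))\, d\tilde{\beta} + U(\beta_1)$ for $\beta \in [\underline{\beta}, \beta_1]$. 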


It remains to optimize with respect to $\beta^*$. 
For $\lambda$ small enough, the problem is strictly 
concave in $\beta^*$ (Appendix 5). Assuming that 
regime $i \in \{1, \dots, 5\}$ is to the left of $\beta^*$, we 
obtain the first-order equation (using the 
continuity of the rent) 


(19) $V^i(c^i(\beta^*)) - (1+\lambda)\psi(\beta^* - c^i(\beta^*)) = V^6(c^6(\beta^*)) - (1+\lambda)\psi(\beta^* - c^6(\beta^*))$. 


(Figures 1 and 2 depict the case in which all 
five regimes exist to the left of $\beta^*$.) Since 
the objective function is concave in $\beta^*$ for 
all $i$, there exists a unique solution. Which 
regime $i$ prevails before bypass occurs de- 
pends on the values of the parameters (see 
Appendix 6). 

Remark on Two-Part Tariffs: A similar analy- 
sis can be performed when the firm is con- 
strained to charging a two-part tariff: $T(q) = a + bq$.¹⁵ 
The number of relevant regimes 
is lower, as there are no longer incentive 
constraints for the two types of customers 
(each type of customer chooses his pre- 
ferred point on the straight-line tariff). So 
IR₁ alone, IR₂ alone, or both individual 
rationality constraints¹⁶ may be binding. 
Results similar to those for nonlinear pric- 
ing can then be obtained. For instance, if 
only the bypass constraint (IR₂) is binding 
(which can be shown to be optimal for some 
values of the parameters), the optimal slope 
of the two-part tariff is given by 


$[b - (\beta - e)] \left[ \alpha_1 \frac{dq_1}{db} + \alpha_2 \frac{dq_2}{db} \right] = \frac{\lambda}{1+\lambda}\, \alpha_1 [q_2(b) - q_1(b)]$. 


Because $q_2(b) > q_1(b)$ and $dq_i/db < 0$ for 
all $b$ and $i$, $b < \beta - e$; the regulated firm’s 
marginal price is below marginal cost. 

Finally, we show that the optimal 
transfer schedule can be implemented 
through a menu of linear contracts. Since 
there is no atom at $\beta^*$, the $\beta^*$ firm is 
indifferent between the bypass regime and 
the no-bypass regime. From Laffont and 
Tirole (1986), we know that in the bypass 
and no-bypass regions the nonlinear trans- 
fer schedule $t(c)$ is convex.¹⁷ It is therefore 
also globally convex across regimes (see Fig. 
3). Therefore, this schedule can be replaced 
by a menu of linear contracts with slope 
$\psi'(\beta(c) - c)$, where $\beta(c)$ is the type that 
produces at marginal cost $c$. The interest of 
this result is that an additive noise can be 
added in the cost function (1) without any 
effect on our results. 


¹⁵We maintain our assumption that the regulator 
chooses an incentive scheme $t(c)$ for the firm. As 
before, the firm’s revenue is not necessarily raised 
entirely by the direct charges to final consumers. 

¹⁶In the latter case, $a$ and $b$, and therefore $q_1(b)$ 
and $q_2(b)$, are completely determined by the two con- 
straints. 

¹⁷The convexity of $t(c)$ in $[c(\underline{\beta}), c(\beta^*)]$ results direct- 
ly from Laffont and Tirole (1986). The proof must be 
slightly extended in $[\beta^*, \bar{\beta}]$, because of the term 
$\lambda F(\beta^*) U(\beta^*)$ in program (16), which represents the 
shadow cost of the rent at the left of the interval; but 
the shadow cost and, therefore, the allocation on $[\beta^*, \bar{\beta}]$ 
are the same as if regime 6 obtained on $[\underline{\beta}, \bar{\beta}]$. There- 
fore, again from Laffont and Tirole (1986), $t(c)$ is 
convex on $[c(\beta^*), c(\bar{\beta})]$. 


[Figure 3: transfer schedule $t(c)$ plotted against $c$, with the indifference curve $t - \psi(\beta^* - c) = $ constant.] 

FIGURE 3. IMPLEMENTATION THROUGH LINEAR SCHEMES (c = MARGINAL 
COST; NB = NO BYPASS; B = BYPASS) 
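The implementation through a menu of linear contracts can be illustrated with a small self-selection check. The sketch assumes $\psi(e) = k e^2/2$ and a hypothetical decreasing effort schedule $e(\beta) = E_0 - M\beta$ (so that $c(\beta) = \beta - e(\beta)$ is increasing); these choices and all names are assumptions of the example, not the paper's solution. Each type $\beta'$ is offered the tangent line to $t(c)$ at $c(\beta')$, with slope of magnitude $\psi'(e(\beta'))$, and a firm of type $\beta$ picks the contract and cost level that maximize its utility.

```python
K = 4.0                          # psi(e) = K*e**2/2 (assumed)
BETA_LOW, BETA_HIGH = 0.5, 0.7   # hypothetical support of beta
E0, M = 1.0, 1.0                 # hypothetical schedule e(beta) = E0 - M*beta


def effort(beta):
    return E0 - M * beta


def cost_target(beta):
    """Marginal cost c(beta) = beta - e(beta) assigned to type beta."""
    return beta - effort(beta)


def rent(beta):
    """U(beta) = integral of psi'(e(x)) from beta to BETA_HIGH, with U(BETA_HIGH) = 0."""
    return K * (E0 * (BETA_HIGH - beta) - 0.5 * M * (BETA_HIGH ** 2 - beta ** 2))


def transfer(beta):
    """t = U + psi(e) at the allocation designed for type beta."""
    return rent(beta) + 0.5 * K * effort(beta) ** 2


def payoff(true_beta, chosen_beta):
    """Best utility of type true_beta under the linear contract meant for chosen_beta.

    The contract pays transfer(chosen) - slope * (c - c(chosen)),
    with slope = psi'(e(chosen)).
    """
    slope = K * effort(chosen_beta)
    # The firm sets psi'(true_beta - c) = slope when choosing its cost level c.
    c = true_beta - slope / K
    t = transfer(chosen_beta) - slope * (c - cost_target(chosen_beta))
    return t - 0.5 * K * (true_beta - c) ** 2


def best_contract(true_beta, steps=1000):
    """Contract (indexed by beta') that type true_beta prefers, by grid search."""
    grid = [BETA_LOW + (BETA_HIGH - BETA_LOW) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda b: payoff(true_beta, b))
```

In this example each type selects the contract designed for it and earns exactly its rent $U(\beta)$. Because the payoff is linear in realized cost once a contract is chosen, adding zero-mean noise to cost would leave expected payoffs, and hence the choice, unchanged.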


III. Bypass and Cream Skimming 


In this section, we compare the optimal 
regulatory mechanism characterized in Sec- 
tion II with the one obtained when, in addi- 
tion, the regulator can monitor the access to 
the bypass technology. We assume that the 
regulator’s new instrument is the possibility 
of prohibiting bypass (that is, bypass is only 
partially regulated). The regulator’s objec- 
tive function is unchanged, but the con- 
straints are now 


(20) $\theta S(q_2(\beta)) - T_2(\beta) \ge \theta S(q_1(\beta)) - T_1(\beta)$ 

(21) $S(q_1(\beta)) - T_1(\beta) \ge S(q_2(\beta)) - T_2(\beta)$ 

(22) $S(q_1(\beta)) - T_1(\beta) \ge 0$ 

(23) $\theta S(q_2(\beta)) - T_2(\beta) \ge 0$ 


when the bypass technology is not used, and 
only (22) when the bypass technology is 
used by high-valuation consumers. 

Then, as in traditional adverse-selection 
problems, only the high-valuation incentive 
constraint and the low-valuation individual 
rationality constraint are binding. For $\beta < \beta^{c*}$, 
we are therefore in regime 1 of Section 
II. For $\beta > \beta^{c*}$ we are in regime 6 of Sec- 
tion II. The pricing policy is illustrated in 
Figure 4. Optimization with respect to the 
value $\beta^{c*}$, at which bypass starts being al- 
lowed, yields 


(24) $V^1(c^1(\beta^{c*})) - (1+\lambda)\psi(\beta^{c*} - c^1(\beta^{c*})) = V^6(c^6(\beta^{c*})) - (1+\lambda)\psi(\beta^{c*} - c^6(\beta^{c*}))$ 


(the superscript $c$ identifies this case of 
“control” of access to bypass). 


FIGURE 4. PRICE PROFILES WHEN BYPASS IS MONITORED (SOLID 
LINE = MARGINAL COST) 


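PROPOSITION 3: $\beta^{c*} \ge \beta^*$; there is exces- 
sive bypass when bypass cannot be prohibited. 


PROOF: 
Suppose that $\beta^{c*} < \beta^*$. From the defini- 
tion of $\beta^*$, we have 


(25) $\int_{\beta^{c*}}^{\beta^*} [\hat{V}(\hat{c}(\beta)) - (1+\lambda)\psi(\beta - \hat{c}(\beta)) - \lambda \hat{U}(\beta)] f(\beta)\, d\beta - \lambda F(\beta^{c*}) \hat{U}(\beta^{c*})$ 
     $\ge \int_{\beta^{c*}}^{\beta^*} [V^6(c^6(\beta)) - (1+\lambda)\psi(\beta - c^6(\beta)) - \lambda U^6(\beta)] f(\beta)\, d\beta - \lambda F(\beta^{c*}) U^6(\beta^{c*})$ 


where 


(26) $U^6(\beta) = \int_{\beta}^{\bar{\beta}} \psi'(\tilde{\beta} - c^6(\tilde{\beta}))\, d\tilde{\beta}$ 


(27) $\hat{U}(\beta) = \int_{\beta}^{\beta^*} \psi'(\tilde{\beta} - \hat{c}(\tilde{\beta}))\, d\tilde{\beta} + U^6(\beta^*)$ 


and where $\hat{c}$ and $\hat{V}$ refer to $c^i$ and $V^i$ for 
the optimal regime $i \in \{1, \dots, 5\}$ (as in Sec- 
tion II). In words, when bypass cannot be 
prohibited, the regulator could have used 
regime 6 on $[\beta^{c*}, \beta^*]$ but elected not to do 
so. Now (25) is satisfied a fortiori if $\hat{V}(\hat{c}(\beta))$ 
is replaced by $V^1(\hat{c}(\beta))$ because $V^1(c) \ge 
\hat{V}(c)$ for all $c$ (there are fewer constraints 
when bypass can be prohibited). This means 
that when bypass can be prohibited, the 
regulator would be better off prohibiting 
bypass on $[\beta^{c*}, \beta^*]$ even if he chose the 
suboptimal function $\hat{c}(\cdot)$ instead of $c^1(\cdot)$ on 
that interval, a contradiction. 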




The intuition for Proposition 3 is that 
when the regulated firm supplies high- 
demand consumers, the threat of bypass 
imposes an additional constraint that re- 
duces welfare relative to the one ($V^1$) that 
can be obtained when bypass is prohibited. 
Because the prohibition of bypass elimi- 
nates this constraint and raises welfare in 
the no-bypass region, it becomes optimal to 
extend the latter region. 

Next we compare the firm’s rents $U(\beta)$ 
and $U^c(\beta)$ when bypass cannot and can be 
prohibited. 


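PROPOSITION 4: There exists $\beta_0 \ge \underline{\beta}$ such 
that 


$U^c(\beta) < U(\beta)$ for $\beta < \beta_0$, 
$U^c(\beta) > U(\beta)$ for $\beta_0 < \beta < \beta^{c*}$, 
$U^c(\beta) = U(\beta)$ for $\beta \ge \beta^{c*}$. 

Furthermore, $\beta_0 \ge \beta_1$ if $\beta_0 > \underline{\beta}$. 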


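PROOF: 
The rent in both cases is given by 
$\int_{\beta}^{\bar{\beta}} \psi'(e(\tilde{\beta}))\, d\tilde{\beta}$, where 


(28) $\psi'(e(\beta)) = Q(\beta) - \frac{\lambda}{1+\lambda} \frac{F(\beta)}{f(\beta)} \psi''(e(\beta))$. 


Next we note that $Q^i(c) \ge Q^1(c) \ge Q^6(c)$ for 
$i \in \{1, \dots, 5\}$ and for all $c$,¹⁸ where $Q^i(c)$ denotes 
total output in regime $i$ at marginal cost $c$. This results from 
Proposition 1 (or Fig. 1); in regimes 2 and 3, 
$q_2$ is the same as in regime 1 (for the given 
marginal cost $c$), while $q_1$ is higher. In 
regimes 4 and 5, both $q_1$ and $q_2$ are higher 
than in regime 1. 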


¹⁸Note the parallel between bypass and the “shut- 
down option” in traditional adverse-selection models 
(e.g., David Baron and Roger Myerson, 1982). In the 
shutdown regions, the firm’s rent does not increase 
with its efficiency. Here, it increases at a slower rate in 
the bypass region ($Q^1 > Q^6$). 

Because in all regimes 


(29) $\psi'(\beta - c^i(\beta)) = Q^i(c^i(\beta)) - \frac{\lambda}{1+\lambda} \frac{F(\beta)}{f(\beta)} \psi''(\beta - c^i(\beta))$ 


and because the objective function is con- 
cave, we get $c^i(\beta) \le c^1(\beta)$ for all $i \in 
\{1, \dots, 5\}$. Hence $\hat{Q}(\beta) \equiv Q^i(c^i(\beta)) \ge 
Q^1(c^i(\beta)) \ge Q^1(c^1(\beta)) \equiv Q^c(\beta)$. 

Equation (28) thus implies that $\psi'(e(\beta))$ 
(which is also the slope of the incentive 
scheme at $\beta$) is higher on $[\beta_1, \beta^*]$ and 
smaller on $[\beta^*, \beta^{c*}]$ when bypass cannot be 
prohibited. The slopes coincide on $[\underline{\beta}, \beta_1]$ 
and on $[\beta^{c*}, \bar{\beta}]$ (see Fig. 5). Proposition 4 
follows immediately. 


The intuition for Proposition 4 is as fol- 
lows. When bypass can be prohibited, the 
bypass region is smaller. Thus, in $[\beta^*, \beta^{c*}]$, 
where bypass is now avoided, the regulated 
firm supplies the high-demand customers, 
which raises demand and makes marginal 
cost reduction more desirable. Thus, more 
incentives are given to the firm to reduce 
costs, which raises the rent of firms in 
$[\beta^*, \beta^{c*}]$. However, in former regimes 2-5 
(if such regimes exist), keeping the high-de- 
mand customers no longer requires the high 
outputs characterized in Proposition 1, as 
bypass can be prohibited. The regulator re- 
duces the incentives for cost reduction and 
thus the firm’s rent. That is, as $\beta$ decreases 
under $\beta^*$, $U^c(\beta) - U(\beta)$ decreases and may 
become negative. The firm need not gain 
from the prohibition of bypass, because the 
threat of bypass was a “good excuse” for 
low prices, high outputs, and thus a high 
rent. 

Let us next consider the effect of a change 
of information on the extent of bypass when 
access to bypass is controlled. We index the 
distribution $F(\beta, \nu)$ and the inverse of the 
hazard rate $H(\beta, \nu) \equiv F(\beta, \nu)/f(\beta, \nu)$ by a 
parameter $\nu$. We assume that $H_\nu < 0$ (that 
is, the hazard rate increases with $\nu$). For 
some families of distributions, an increase 
in the hazard rate corresponds to an im- 
provement in information. For instance, for 
a uniform distribution on $[\underline{\beta}, \bar{\beta}]$, $H = \beta - \underline{\beta}$, 
so that when $\underline{\beta}$ increases, the support of 
the uniform distribution shrinks (in this ex- 
ample, $\nu = \underline{\beta}$). Differentiating (24) we get 

(30) [Equation not legible in the source: an expression for $d\beta^{c*}/d\nu$ involving $\lambda$, $H(\beta, \nu)$, and the derivatives with respect to $\nu$ of $\psi'(e^1(\beta^{c*}))$ and $\psi'(e^6(\beta^{c*}))$.] 

Intuitively, an increase of $\nu$ creates more 
bypass if, at $\beta^{c*}$, the rate of increase of the 
(costly) rent $\psi'$ is less affected in the no- 
bypass regime than in the bypass regime. 

Differentiating the first-order conditions 
defining quantities and effort [see (28) as 
well as regimes 1 and 6 in Appendix 3], we 
obtain 

(31), (32) [Equations not legible in the source: expressions for the effect of $\nu$ on effort in regimes 1 and 6, involving $\psi''$, $\psi'''$, $H$, and $dQ/dc$.] 


[Figure 5: slope of the incentive scheme $\psi'(e)$ plotted against $\beta$, with one curve for the case in which bypass cannot be prohibited and one for the case in which it can; the horizontal axis marks $\underline{\beta}$, $\beta_1$, $\beta^*$, $\beta^{c*}$, and $\bar{\beta}$.] 

FIGURE 5. SLOPE OF THE INCENTIVE SCHEME 


PROPOSITION 5: Assume that $\psi''$ is con- 
stant and that either $\lambda$ is small or demand 
functions are concave. Then an increase in $\nu$ 
that increases the hazard rate $f/F$ around $\beta^{c*}$ 
increases $\beta^{c*}$.¹⁹ 


PROOF: 

We know from Appendix 4 that $Q^1 > Q^6$. 
Hence, $e^1 > e^6$, from (14) and the associated 
second-order condition. Next, $\lambda$ small or 
concave demand functions imply that 
$dQ^1/dc < dQ^6/dc < 0$ (see the expressions 
in Appendix 3). Thus (12), (13), and (14) 
imply that $d\beta^{c*}/d\nu > 0$. 


The intuition for Proposition 5 is that, as 
f/F increases, the concern with the firm’s 
rent in the no-bypass region (which has 
probability F) decreases relative to the con-
cern for the distortion at $\beta^{0*}$ (which consists
in imposing bypass to reduce the rent of 
better types and has probability f). The 
need for assumptions in Proposition 5 comes 
from the fact that output, and not only the 
firm’s rent and cost, matters. Without such 
assumptions the result might be reversed. 

For the case in which bypass cannot be 
prohibited, the result is similar and true 
even more often, since the slope |dQ/dc| is
higher in that case (at least for regimes 3, 4, 
and 5). 

Finally, let us note that, at least for small
λ, bypass increases with λ, both with and
without control of access to bypass. Intu-
itively, more-costly transfers make bypass
more desirable by increasing the incentive
costs of the regulated firm. This holds at
least for small λ; for large λ, the social gain
stemming from the firm’s revenue λR(q)
may upset this result. We now prove this 
result in the case where bypass is con- 
trolled, but a similar reasoning holds in the 
other case. 


PROPOSITION 6: Bypass increases with λ,
for λ small.


PROOF: 
Let SC(β) = (β − e)Q + ψ(e) denote to-
tal social cost, and let $R(\beta) = \alpha_1 T_1 + \alpha_2 T_2$


19The probability of bypass is equal to $1 - F(\beta^{0*}, \nu)$.
The total effect of an increase in ν is in general
ambiguous, as F decreases with ν but increases with
$\beta^{0*}$, which itself increases with ν.
denote total revenue. Differentiating (24)
gives 


dp% 
33). ——<= 
(33) = 7 


_ [R'(67) ~sc*(8") J -LR°(8") -sce 
a'(B™)- Q (e°) 


At A=0, 
R'(B")— SC'(B") =W*(B°) - (8-1) 
xa,S(9gi(B™)) 
R°(B*)— SC°(B") =W°(B™) + a, SF. 


As social welfare is continuous at $\beta^{0*}$, the
result follows since $Q^i(\beta^{0*}) > Q^6(\beta^{0*})$.


IV. Conclusion 


Our main economic findings may be sum- 
marized as follows: 


(a) Asymmetric information between the 
regulator and the firm raises the actual 
cost of the regulated firm and increases 
the probability of bypass.²⁰

(b) Bypass should be fought (i.e., high- 
demand customers should be retained) 
when the regulated firm is efficient. An 
efficient firm is screened through its 
choice of a steep (high-slope) incentive 
scheme. Hence, the slope of the regu- 
lated firm’s incentive scheme is posi- 
tively correlated with its success in 
fighting bypass. 

(c) It may be optimal to charge marginal 
prices below marginal cost for high- 
demand customers. Because these cus- 
tomers must be granted advantageous 
terms to be retained, low-demand cus- 
tomers must be dissuaded from buying 
the high-demand customers’ bundle by 
charging a high fixed fee and a low 
marginal price. 


20The role of bypass in providing discipline for the 
regulated firm is similar to the role played by entry and 
auditing in Joel Demski et al. (1987) and in David 
Scharfstein (1988). 
(d) Low-demand customers are not neces-
sarily hurt by the threat of bypass. In 
our model, they may enjoy a positive net 
consumer surplus when the regulated 
firm is constrained by bypass but does 
not let bypass operate, while they never 
do when bypass is controlled by the 
regulator. They indirectly benefit from 
the high-demand customers’ being of- 
fered advantageous terms. With more 
than two customer types, it can be shown 
that some customers may be hurt by 
bypass.²¹ Our point here is that the ef-
fect of bypass on low-demand customers 
is ambiguous; the “skimmed milk” need 
not be made worse off by bypass, con- 
trary to conventional wisdom. 

(e) There is excessive bypass if bypass can- 
not be controlled by the regulator. By- 
pass interferes with optimal second- 
degree price discrimination. 

(f) A mediocre regulated firm is hurt by 
bypass. An efficient regulated firm may 
benefit from the threat of bypass, be- 
cause it can use it to vindicate high 
levels of production. 


Caution should be exercised in particular 
when applying the last two conclusions. We 
compared two regulatory institutions, in 
which the regulator has or does not have 
the authority to prohibit the competitive 
technology. The analysis is restrictive for 
two (related) reasons. First, in principle 
there exists a vast array of government in- 
terventions with regard to competition, 
which include direct regulation and subsi- 
dies or taxation. Second, we did not make 
explicit the reasons why the government has 
limited authority on the competitive sector. 
Presumably, this limitation in scope of au- 
thority stems from the costs of regulation or 
from the fear that the extension of the 
scope of regulatory authority from the dom- 
inant firm to the whole industry would re- 
sult in producer protection. Despite these 


21For instance, with three types, if type-3 (high-
demand) customers use the bypass technology, the 
costs of producing for the remaining two types increase 
because of returns to scale, which may reduce the net 
consumer surplus of type-2 customers. 
caveats, we believe that a normative analysis
such as the one performed here is a first
step toward understanding the policy
trade-offs with regard to bypass and cream
skimming.

Caution should also be exercised in the 
study of particular industries. While we ana- 
lyzed optimal regulation, transfers from the 
regulators to the firms are sometimes legally 
prohibited (e.g., in the telecommunications 
and electricity industries in the United 
States). A regulated firm’s cost is then en- 
tirely paid by direct charges to consumers. 
Some of our conclusions may be affected by 
the impossibility of transfers; see Laffont 
and Tirole (1990c) for an attempt at ex- 
plaining the prohibition of transfers. For 
instance, low-demand consumers are more 
likely to be hurt by bypass in the presence 
of returns to scale when consumers must 
pay the firm’s full cost. 

Last, several additional issues could be 
addressed within our normative framework. 
For instance, if the regulated firm were to 
choose ex ante among technologies, would it 
prefer a high-investment low-marginal-cost 
technology to fight bypass or a low-invest- 
ment high-marginal-cost one to focus on 
low-demand customers, and how would this 
affect high- and low-demand customers? 
Also, we have ignored some potential bene- 
fits of competition. If the bypass technology 
is similar to the regulated firm’s, the regula- 
tor can use bypass as a yardstick to monitor 
the firm further. Moreover, it might be the 
case that bypass enhances product variety 
by making available goods that cannot be 
produced by the regulated firm. These and 
other questions are left open for future
research. 


APPENDIX 1 


From the revelation principle, a regula- 
tory scheme can be represented by a revela- 
tion mechanism which specifies for each 
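announcement of the cost characteristic β
levels of production $q_1(\beta)$ for a type-1 con-
sumer and $q_2(\beta)$ for a type-2 consumer, a
total cost target $C(\beta)$, and a net transfer
received by the firm from the regulator $t(\beta)$.
Incentive compatibility at β and β′ re-
quires

\[
U(\beta,\beta) = t(\beta) - \psi\!\left(\beta - \frac{C(\beta)}{\alpha_1 q_1(\beta) + \alpha_2 q_2(\beta)}\right)
\ge U(\beta,\beta') = t(\beta') - \psi\!\left(\beta - \frac{C(\beta')}{\alpha_1 q_1(\beta') + \alpha_2 q_2(\beta')}\right) \tag{A1}
\]

\[
U(\beta',\beta') = t(\beta') - \psi\!\left(\beta' - \frac{C(\beta')}{\alpha_1 q_1(\beta') + \alpha_2 q_2(\beta')}\right)
\ge U(\beta',\beta) = t(\beta) - \psi\!\left(\beta' - \frac{C(\beta)}{\alpha_1 q_1(\beta) + \alpha_2 q_2(\beta)}\right) \tag{A2}
\]

Adding (A1) and (A2) and denoting $c(\beta) =
C(\beta)/[\alpha_1 q_1(\beta) + \alpha_2 q_2(\beta)]$ (the average
cost), we get

\[
\psi(\beta' - c(\beta)) - \psi(\beta - c(\beta)) \ge \psi(\beta' - c(\beta')) - \psi(\beta - c(\beta')). \tag{A3}
\]

Consider the function $\Delta(x) = \psi(\beta' - x) - \psi(\beta - x)$.
For β < β′, $\Delta'(x) < 0$ since ψ″ > 0.
Therefore, (A3) implies $c(\beta) \le c(\beta')$. The
revelation mechanism can alternatively be
represented by the functions $q_1(\cdot)$, $q_2(\cdot)$,
$c(\cdot)$, and $t(\cdot)$.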

Let U(β) be the rent captured by a firm
of type β. Incentive compatibility implies
that U(·) is continuous and nonincreasing,
since a type-β firm with β < β′ can always
mimic a type-β′ firm at smaller cost. U(β) is
therefore almost everywhere differentiable.
Similarly, c(β) is almost everywhere differ-
entiable. $\dot{U}(\beta)$ exists almost everywhere and

\[ \dot{U}(\beta) = -\psi'(\beta - c(\beta)) \]

almost everywhere, and the rent of type β is

\[ U(\beta) = \int_{\beta}^{\bar{\beta}} \psi'(\tilde{\beta} - c(\tilde{\beta}))\, d\tilde{\beta}. \]

Necessary and sufficient second-order
conditions are

\[ \dot{c}(\beta) \ge 0 \]

(see Roger Guesnerie and Laffont, 1984).
Since $\dot{U}(\beta) < 0$, the firm’s individual ratio-
nality constraint reduces to

\[ U(\bar{\beta}) = 0. \]

APPENDIX 2 


In addition to the bypass regime, we have 
potentially as many regimes as combinations 
of binding constraints among (8)-(11). How- 
ever, the following lemmas cut down the 
number of cases to five. 


LEMMA 1: If the two types of consumers are
offered two different contracts, the two incen- 
tive constraints cannot be simultaneously 
binding. 


PROOF: 

Suppose the contrary. If the type-1 con-
straint is binding, $T_2 - T_1 = S(q_2) - S(q_1)$. If
the type-2 incentive constraint is also bind-
ing, we have $\theta(S(q_2) - S(q_1)) = S(q_2) - S(q_1)$,
a contradiction unless $q_1 = q_2$; but
then $T_1 = T_2$, contradicting the fact that we
have two distinct contracts.


LEMMA 2: If the type i (=1,2) incentive
constraint is not binding, then the type-j indi- 
vidual rationality constraint is binding. 


PROOF: 
Reduce T, if the type-j individual ratio- 
nality constraint is not binding. 


LEMMA 3: A pooling contract can never be 
optimal. 


PROOF: 

Note first that in a pooling contract (q,T)
consumers’ incentive constraints are auto-
matically satisfied. (i) If $p_2 = \theta S'(q) > (\beta - e)(1+\lambda)$,
increase $q_2$ by ε and the transfer by
$dT_2 = \theta S'(q)\varepsilon$ so that type-2
consumers remain indifferent. Since at (q,T)
the marginal rate of substitution between q
and T is higher for type-2 consumers, this
new allocation is incentive compatible. This
raises welfare by $\alpha_2[\theta S'(q) - (1+\lambda)(\beta - e)]\varepsilon$.
(ii) If $p_2 \le (\beta - e)(1+\lambda)$, $p_1 < (\beta - e)(1+\lambda)$.
Decrease $q_1$ by ε, adjusting $T_1$ so that type-1
consumers remain on the same indifference
curve [i.e., $dT_1 = -S'(q)\varepsilon$]. Then, the total
welfare change is
$-(1+\lambda)\alpha_1 S'(q)\varepsilon + (1+\lambda)(\beta - e)\varepsilon$, which is
positive by assumption, a contradiction.

Combining Lemmas 1, 2, and 3 we have
only the five regimes described in the text in
addition to the bypass regime.


APPENDIX 3


PROOF OF PROPOSITION 1:

The relevant transfers are deduced from the
relevant binding constraints. For regime 1,

\[
\max\{\alpha_1 S(q_1) + \alpha_2\theta S(q_2) - (1+\lambda)c(\alpha_1 q_1 + \alpha_2 q_2)
+ \lambda[\alpha_1 S(q_1) + \alpha_2(\theta S(q_2) - \theta S(q_1) + S(q_1))]\}
\]

yields

\[
\frac{p_1 - c}{p_1} = \frac{\lambda}{1+\lambda}\,\frac{\alpha_2}{\alpha_1}\,(\theta - 1), \qquad p_2 = c
\]

[letting $p_1 = S'(q_1)$ and $p_2 = \theta S'(q_2)$]. Note
that $dp_1/dc > 0$.

For regime 2,

\[
\max\{\alpha_1 S(q_1) + \alpha_2\theta S(q_2) - (1+\lambda)c(\alpha_1 q_1 + \alpha_2 q_2)
+ \lambda(\alpha_1 S(q_1) + \alpha_2(\theta S(q_2) - S_2^{\#}))\}
\]

with the constraint

\[ (\theta - 1)S(q_1) = S_2^{\#} \]

yields

\[ p_2 = c, \qquad q_1 = S^{-1}\!\left(\frac{S_2^{\#}}{\theta - 1}\right), \]

implying

\[ \frac{dp_1}{dc} = 0. \]

For regime 3,

\[
\max\{\alpha_1 S(q_1) + \alpha_2\theta S(q_2) - (1+\lambda)c(\alpha_1 q_1 + \alpha_2 q_2)
+ \lambda(\alpha_1 S(q_1) + \alpha_2(\theta S(q_2) - S_2^{\#}))\}
\]

yields

\[ p_1 = p_2 = c. \]

For regime 4,

\[
\max\{\alpha_1 S(q_1) + \alpha_2\theta S(q_2) - (1+\lambda)c(\alpha_1 q_1 + \alpha_2 q_2)
+ \lambda(\alpha_1 S(q_1) + \alpha_2(\theta S(q_2) - S_2^{\#}))\}
\]

with the constraint

\[ \theta S(q_2) - S(q_2) = S_2^{\#} \]

yields

\[ p_1 = c, \qquad q_2 = S^{-1}\!\left(\frac{S_2^{\#}}{\theta - 1}\right), \]

implying

\[ \frac{dp_2}{dc} = 0. \]

For regime 5,

\[
\max\{\alpha_1 S(q_1) + \alpha_2\theta S(q_2) - (1+\lambda)c(\alpha_1 q_1 + \alpha_2 q_2)
+ \lambda(\alpha_1(S(q_1) - S(q_2) + \theta S(q_2) - S_2^{\#}) + \alpha_2(\theta S(q_2) - S_2^{\#}))\}
\]

yields

\[
p_1 = c, \qquad \frac{p_2 - c}{p_2} = -\frac{\lambda}{1+\lambda}\,\frac{\alpha_1}{\alpha_2}\,\frac{\theta - 1}{\theta},
\]

implying

\[ \frac{dp_2}{dc} > 0. \]

For regime 6,

\[
\max\{\alpha_1 S(q_1) + \alpha_2 S_2^{\#} - (1+\lambda)c\,\alpha_1 q_1 + \lambda\alpha_1 S(q_1)\}
\]

yields

\[ p_1 = c. \]

Note that since individual prices $p_1$ and $p_2$
are nondecreasing in c in all regimes, individual
quantities $q_1$ and $q_2$ (and also aggregate
quantities Q) are nonincreasing in c.
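
The markups implied by the regime 1 and regime 5 programs can be checked numerically. The sketch below is only an illustration: it assumes the first-order conditions of those two programs take the Lerner-index form $(p_1 - c)/p_1 = \frac{\lambda}{1+\lambda}\frac{\alpha_2}{\alpha_1}(\theta - 1)$ in regime 1 and $(p_2 - c)/p_2 = -\frac{\lambda}{1+\lambda}\frac{\alpha_1}{\alpha_2}\frac{\theta - 1}{\theta}$ in regime 5, and the function names and parameter values are hypothetical.

```python
def regime1_p1(c, lam, alpha1, alpha2, theta):
    """Price to type-1 consumers in regime 1 (assumed Lerner-index form)."""
    k = lam / (1.0 + lam) * (alpha2 / alpha1) * (theta - 1.0)
    if k >= 1.0:
        # The markup formula has no finite positive solution in this case.
        raise ValueError("markup term must be below 1")
    return c / (1.0 - k)


def regime5_p2(c, lam, alpha1, alpha2, theta):
    """Price to type-2 consumers in regime 5 (assumed Lerner-index form)."""
    k = lam / (1.0 + lam) * (alpha1 / alpha2) * (theta - 1.0) / theta
    return c / (1.0 + k)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    params = dict(c=1.0, lam=0.25, alpha1=0.5, alpha2=0.5, theta=2.0)
    print("regime 1, p1:", regime1_p1(**params))  # above marginal cost
    print("regime 5, p2:", regime5_p2(**params))  # below marginal cost
```

With λ = 0 both functions return c, and both prices rise with c, which is the monotonicity that the closing note of this appendix relies on.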


APPENDIX 4 


PROOF OF PROPOSITION 2: 

(i) Let $Q^i$ denote aggregate production in
regime i. Note that $Q^i > Q^6$ for i = 1,...,5
and for any marginal cost c. From Ap-
pendix 3, under bypass, $p_1 = c$ and $q_2 = 0$.
For i = 3, 4, or 5, $p_1 = c$ so that $q_1$ is the
same as in regime 6; but since $q_2 > 0$ in any
regime i, i ≤ 5, $Q^i > Q^6$. For i = 1 or 2, $IC_2$
and $IR_1$ are strictly binding, and no bypass
occurs. Hence, $(\theta - 1)S(q_1^i) \ge S_2^{\#}$. In regime
6, $IR_1$ is strictly binding, and bypass occurs.
Hence, $(\theta - 1)S(q_1^6) < S_2^{\#}$. This implies that
$q_1^i > q_1^6$ for i = 1 or 2.

From the envelope theorem, for any i,
$(d/dc)W^i(c) = -(1+\lambda)Q^i$. Therefore,

\[
\frac{d}{dc}\left[W^i(c) - W^6(c)\right] = -(1+\lambda)(Q^i - Q^6) < 0, \qquad i = 1,\ldots,5.
\]

Consequently, there exists $c^* \in [0, +\infty)$ such
that regime 6 corresponds to c levels above
c*. Since c is a nondecreasing function of β,
from (21), there exists an interval $[\beta^*, \bar{\beta}]$
(possibly degenerate) in which bypass oc-
curs.

Note that, at β*, there is a discontinuity
in total production, since in regime 6,

\[ S'(q_1) = (\beta^* - e), \]

\[ \psi'(e) = \alpha_1 q_1 - \frac{\lambda}{1+\lambda}\,\frac{F(\beta^*)}{f(\beta^*)}\,\psi''(e), \]

\[ Q^6 = \alpha_1 q_1, \]

and in regime 5,

\[ S'(q_1) = (\beta^* - e), \]

\[ \psi'(e) = \alpha_1 q_1 + \alpha_2 q_2 - \frac{\lambda}{1+\lambda}\,\frac{F(\beta^*)}{f(\beta^*)}\,\psi''(e). \]

The discontinuity in Q and e translates into
a discontinuity in marginal cost at β*.

(ii) By the change of variables $S(q_1) = s_1$
and $S(q_2) = s_2$, the constraints define a con-
vex set in the space of control variables.
Furthermore, because ψ″ > 0, the function
of the control variable c, −ψ(β − c), is
concave, and the objective function is con-
cave in control and state variables. There-
fore, the Pontryagin conditions are suffi-
cient and yield continuous controls on
$[\underline{\beta}, \beta^*]$ and $[\beta^*, \bar{\beta}]$; see theorem 5 in Atle
Seierstad and Knut Sydsaeter (1987 p. 28).
By ordering the regimes according to
1,2,3,4,5, we exhibit a continuous solution
that satisfies the Pontryagin conditions and
is therefore a solution.

APPENDIX 5 

Suppose that we have regime i (i ≤ 5) at
the left of β*. The second derivative of the
objective function with respect to β* is
[using $dW^i/dc = -(1+\lambda)Q^i$]


i 


de 
(A4) -01+ A)Q' g —(1+A) 


dc! 
x w'(p*—c'(B Dja- 


+ Awv'(B*—c!(B*)) . 


6 
++ aor +(1+A) 


dc® 
y'(B*- eB) A 
— ày’ (B* — c°( B*)). 
Using the fact that 
y'(B* — c'(B*)) 


l 
1+2\ F(6*) 





j-e 


(A4) becomes 
—|y'(B* -c'(B*)) 
— w'(B* - c°( B*))| 


Mo i ) 
HE 


(A5) 





de 
Lye- a)i 


| dc® 
— w"(B*— 0B") ) a 
Since $\beta^* - c^i(\beta^*) > \beta^* - c^6(\beta^*)$ for all i,
(A5) is negative for λ = 0. The objective
function is therefore concave in β* in a
neighborhood of λ = 0.

For λ large, the result may become am-
biguous in some cases. For example, if ψ″ is
constant, the second part of (A5) is also
negative in regimes 3, 4, and 5. In regime 2,
the sign of the second term is ambiguous
because |dQ/dc| may be smaller in regime
2 than in regime 6.


APPENDIX 6 


It can be checked that each of the regimes 
is relevant for some values of the parame- 
ters. Of particular interest for our analysis 
is the possibility of existence of regime 5. To 
check this, let us construct economies for 
which regime 5 is optimal among regimes
i ∈ {1,...,5} for all $\beta \in [\underline{\beta}, \bar{\beta}]$ (and therefore
is globally optimal for small β’s). Suppose
that

\[ S(q) = \frac{q^{1 - 1/\varepsilon}}{1 - 1/\varepsilon}, \]

where ε > 1. That is, the demand functions
have constant elasticity ε. Consider a se-
quence of economies indexed by θ in which
θ tends to 1 (the consumers become more
and more alike). The bypass technology has
cost f(θ) + d(θ)q, where f(·) and d(·) are
to be determined. All other data are fixed.
Straightforward computations show that

\[ S_2^{\#} = \frac{(d(\theta))^{1-\varepsilon}\,\theta^{\varepsilon}}{\varepsilon - 1} - f(\theta) \]

and

\[ S_1^{\#} = \frac{(d(\theta))^{1-\varepsilon}}{\varepsilon - 1} - f(\theta). \]

Now choose d(θ) converging to 0 suffi-
ciently fast with θ so that $S_2^{\#} - S_1^{\#} \to +\infty$,
and choose f(θ) so as to keep $S_2^{\#}$ constant.
The analysis in the text is unchanged, as $S_2^{\#}$
is constant along the sequence and $S_1^{\#}$ is
negative. However, $p_1(\theta)$ and $p_2(\theta)$ con-
verge to the marginal cost (β − e) in all
regimes. Because the difference $\theta S(q_2) - S(q_2)$
converges to 0 when θ converges to 1
(as long as the marginal cost β − e does not
converge to 0, which is guaranteed if β is
not too small and ψ(·) is sufficiently steep),
in the limit the net surplus of type-1
customers is $S(q_1) - T_1 = S_2^{\#} > 0$. Therefore, the
type-1 customers’ IR constraint is not bind-
ing, which indicates that regime 5 is ob-
tained in the limit [if bypass is prevented,
which will be the case for an appropriate
ψ(·) function].
Durable-Good Monopoly and Best-Price Provisions


By DAVID A. BUTZ*


Best-price provisions guarantee buyers that the prices they pay are the lowest 
available. If the seller subsequently cuts price, then each previous buyer is entitled 
to a refund. A durable-good monopolist who offers certain forms of these 
provisions can construct a consistent plan yielding the same profits as rental 
agreements and contracts with explicit quantity commitments. The provisions 
require special circumstances to be practical, but they are simple and effective and 
appear in a variety of economic settings. Three applications are discussed: 
international commodity agreements, markets for electric turbogenerators, and 
markets for financial claims. (JEL 610, 611, 612) 


In a classic paper, Ronald Coase (1972) 
conjectures that a monopoly seller of an 
infinitely durable good cannot sell output at 
the static monopoly level. Once the initial 
quantity has been sold, more profits can be 
made by cutting price and increasing out- 
put. Profit opportunities end only after price 
falls to marginal cost. Without some re- 
straints, the market is saturated with the 
competitive output “... in the twinkling of 
an eye” (Coase, 1972 p. 143). 

Since monopolists do not routinely be- 
have as competitors, either real-world con- 
ditions do not mirror those assumed in 
Coase’s illustration or monopolists some- 
how commit not to behave in this manner. 
Nancy Stokey (1981), Jeremy Bulow (1982), 
and Charles Kahn (1986) show that if pro- 
duction capacity is limited or if marginal 
cost is increasing, then the monopolist’s 
problem is less severe. Lawrence Ausubel 
and Raymond Deneckere (1987, 1989) 
demonstrate the existence of equilibria other 
than the one described by Coase and show 
that potential entry may actually mitigate 
the monopolist’s problems. In short, circum- 
stances may not be as bleak as in Coase’s 
exposition. Nonetheless, the problem often 
remains in less severe form. 


*Department of Economics, University of California 
at Los Angeles, Los Angeles, CA 90024-1477. I thank 
the participants in workshops at UCLA and USC, two 
anonymous referees, Bill Gale, John Riley, and espe- 
cially Michael Waldman for helpful comments on ear- 
lier versions. All errors are my own.
This suggests that monopolists commit not
to behave competitively. Coase offers three 
possibilities. The monopolist can commit not 
to sell additional output; the good can be 
rented rather than sold; or the monopolist 
can agree to repurchase the good if ever a 
lower price is offered.

This paper offers an alternative similar to 
repurchase agreements.¹ The monopolist’s
problem can be resolved by employing
best-price (BP) provisions.² These guaran-
tee that the price to be paid or received is
the best available. If better terms are subse- 
quently negotiated in any related contract, 
then the monopolist must refund the dif- 
ference between the original price and the 
new lower price. The outcome is the same 
as if the monopolist had repurchased the 
good at the original price and then resold it 
at the lower price. BP provisions and repur- 
chase agreements therefore differ in only 
one respect: while repurchase agreements 
require the monopolist to reassume owner- 
ship of the good, BP provisions do not. 

After providing a background (Section I), 
an explanation and example (Section II) 
illustrate the mechanics of BP provisions. A 
discrete-time model with demand uncer-
tainty is outlined in Section III. The propo-


1The similarity between best-price provisions and
repurchase agreements is also discussed by Ivan Png
(1987).

2Cooper (1984) suggests best-price provisions to re-
solve the durable-good monopoly problem.
sitions in Section IV demonstrate the role
BP provisions play in resolving the 
monopolist’s problem with dynamic incon- 
sistency. Section V discusses the results in a 
continuous-time framework. Section VI lists 
the provisions’ advantages and disadvan- 
tages, Section VII provides some applica- 
tions, and a conclusion follows. 


I. Background on Best-Price Provisions 


Although best-price provisions are perva- 
sive in many economic contexts, their scope 
is typically restricted. Retailers, for exam- 
ple, often extend the provision only to the 
same brand name and model and limit it to 
specific time periods and geographic areas. 
“Three-party” BP provisions (or “meet- 
the-competition” clauses) guarantee the 
lowest price offered by any seller of the 
good; “two-party” versions apply only to the 
lowest price offered by the seller involved in 
the original transaction. 

International commodity agreements have 
employed best-tariff terms, known as 
“most-favored-nation” (MFN) provisions, 
for over three centuries. By offering such 
provisions, a country promises each trading 
partner access to its domestic markets at 
tariff rates that are no higher than those 
offered by that country to any other trading 
partner. The consensus in the international 
trade literature is that MFN’s assure 
nondiscrimination: 


...the most-favored-nation clause 
conferred no privileges of any great 
importance, for the general rule was 
to treat all nations as equals. Most- 
favored-nation treatment then, meant 
not favored treatment, but merely a 
guarantee against being less favorably 
treated than other foreign nations. 

(Vernon Setser, 1937 p. 69) 


Potential discrimination has also been 
cited to explain BP provisions in long-term 
contracts between natural-gas producers and 
pipelines (Edward Neuner, 1960; R. Glenn 
Hubbard and Robert Weiner, 1986). 

Other authors (Frederic Scherer, 1980; 
David Grether and Charles Plott, 1984; 
Charles Holt and David Scheffman, 1987; 
Stephen Salop, 1986; Thomas Cooper, 1986)
argue that two-party BP provisions enhance 
tacit collusion. By committing the firm to 
pay rebates if it ever cuts price, the provi- 
sions reduce competition for new cus- 
tomers. Terrence Belton (1986) demon- 
strates how meet-the-competition provisions 
enhance collusion by committing each firm 
to match the prices of industry rivals. 

Risk sharing is also a proposed motive for 
BP provisions in field markets for natural
gas. Paul MacAvoy (1962) and Harry Broad- 
man and W. David Montgomery (1983) ar- 
gue that the provisions may result in a more 
efficient allocation of risk by shifting price 
uncertainty from the beneficiary of the 
clause to the benefactor. 

Attention here focuses on the nondis- 
crimination motive for BP provisions. Both 
two- and three-party versions are modeled. 
Alternate hypotheses are raised again in the 
conclusion. 


II. A Simple Explanation of Best-Price
Provisions 


The demand for ownership of a durable 
good is illustrated in Figure 1. The price, 
P(X), is decreasing in the quantity X sold. 
If a monopoly seller produces at zero 
marginal cost, the solution might appear to 
involve selling M units at price P(M) per 
unit. Coase’s revelation is that this solution
does not hold when future behavior is con- 
templated. Having sold M units, the 
monopolist can lower price to sell addi- 
tional output. Knowing this, prospective 
buyers balk at paying P(M). Unless the 
monopolist can commit not to cut price, no 
output can be sold at any price above 
marginal cost.³

Now consider the outcome when BP pro- 
visions are offered. Suppose the monopolist 


3Coase’s reasoning has been supported by several 
subsequent authors, including Stokey (1981), Bulow 
(1982), Eric W. Bond and Larry Samuelson (1984), 
Faruk Gul, Hugo Sonnenschein, and Robert Wilson 
(1986), and Kahn (1986). When marginal cost is in- 
creasing, price does not immediately fall to marginal 
cost, but in all cases, the monopolist may be unable to 
reap the full rewards that would be possible through 
commitment to a preannounced production plan. 
FIGURE 1. DEMAND FOR OWNERSHIP OF A DURABLE GOOD


begins by selling Q units at price P(Q) per 
unit. If AQ additional units are sold, rev- 
enues increase by P X AQ, but the 
monopolist must rebate Q X AP to previous 
customers. Output is set such that marginal 
rebates equal marginal revenue. This yields 
the standard monopoly outcome. 
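
The stopping rule described above can be illustrated with a small numerical sketch. It assumes a linear demand curve P(X) = a − bX and zero marginal cost; both the functional form and the parameter values are assumptions made for illustration, not taken from the text. At each step the seller compares the revenue from an extra block of output with the rebates owed to earlier buyers, and stops expanding when the rebates outweigh the revenue.

```python
def price(x, a=100.0, b=1.0):
    """Assumed linear inverse demand P(X) = a - b*X (illustrative only)."""
    return a - b * x


def net_gain(q, dq, a=100.0, b=1.0):
    """Revenue from selling dq more units minus rebates owed on the q units already sold."""
    new_price = price(q + dq, a, b)
    revenue = new_price * dq
    rebates = q * (price(q, a, b) - new_price)
    return revenue - rebates


def bp_output(a=100.0, b=1.0, dq=1.0):
    """Expand output in steps of dq for as long as the net gain is positive."""
    q = 0.0
    while net_gain(q, dq, a, b) > 0:
        q += dq
    return q


if __name__ == "__main__":
    print("output with BP provisions:", bp_output())
    print("static monopoly output a/(2b):", 100.0 / (2 * 1.0))
```

Under these assumptions the loop stops at the static monopoly quantity a/(2b), because the net gain from a small expansion is the marginal revenue of the demand curve.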


A. An Example


This result also holds when cost or de- 
mand is uncertain. A hypothetical event (say 
an art show) is funded in part through sales 
of a commemorative lithograph. There is 
only one seller of the prints, the event’s 
sponsor, who produces at zero marginal cost 
and maximizes expected profits. Sales take
place before (t = 0) and after (t = 1) the
event. The sponsor can credibly commit to
destroy the plate after the second period, so
there are only two production decisions, q₀
and q₁.

Potential buyers include 1,000 individuals
who will pay up to $100 for the print. An
additional η individuals will pay up to $60,
where η is distributed uniformly over
[0, 1,000] and observed only after the event
occurs (i.e., at time t = 1). Figure 2 shows
final demand for the print.

Up to 1,000 prints can be sold for $100 at
time t = 0, but not if customers foresee the
possibility of a $60 price one period later.
The monopolist could commit not to pro-
duce more than 1,000 prints but would like
the flexibility to sell additional amounts if η
is high. The seller maximizes expected prof-
its by committing not to sell prints at time
t = 1 unless η > 667. The 1,000 prints sold
at time t = 0 command a price which re-
flects the 1/3 probability of a price cut in
the following period.⁴

Suppose BP provisions are used instead.
The seller sets q₀ = 1,000 and charges $100
per print. At time t = 1, the seller then
weighs the revenues from selling η addi-
tional prints ($60 × η) against the rebates
that would have to be paid to previous
buyers ($40 × 1,000). The seller sets q₁ = η
if and only if η > 667. Otherwise q₁ = 0.
Because BP provisions redistribute risk from
initial buyers to the seller, the initial price
and realized profits differ from the scenario
in which quantity commitments are em-
ployed. Yet output levels and expected
profits are the same.

This example is discussed in further de- 
tail in Section VII. 
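
The arithmetic behind the cutoff of 667 and the 1/3 probability can be reproduced directly from the numbers in the example. The sketch below uses exact fractions; the function and constant names are illustrative and not part of the original.

```python
from fractions import Fraction

HIGH_PRICE = 100     # willingness to pay of the 1,000 high-value buyers ($)
LOW_PRICE = 60       # willingness to pay of the eta additional buyers ($)
HIGH_BUYERS = 1_000  # prints sold at t = 0
ETA_MAX = 1_000      # eta is uniform on [0, ETA_MAX]


def rebate_bill():
    """Total rebates owed to time-0 buyers if the price is cut to LOW_PRICE."""
    return (HIGH_PRICE - LOW_PRICE) * HIGH_BUYERS


def cutoff_eta():
    """Value of eta at which new revenue exactly equals the rebate bill."""
    return Fraction(rebate_bill(), LOW_PRICE)


def sells_at_t1(eta):
    """The seller cuts price at t = 1 only if new revenue exceeds the rebates."""
    return LOW_PRICE * eta > rebate_bill()


def prob_price_cut():
    """Probability that eta exceeds the cutoff under the uniform distribution."""
    return 1 - cutoff_eta() / ETA_MAX


if __name__ == "__main__":
    print("rebate bill:", rebate_bill())
    print("cutoff eta:", float(cutoff_eta()))
    print("probability of a price cut:", prob_price_cut())
```

The cutoff is 40,000/60, which the text rounds to 667, and the probability that η exceeds it is exactly 1/3.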


4If buyers are risk-neutral and have the same dis-
count rate, δ, then the initial price equals φ₀ +
δ{(2/3)($100) + (1/3)($60)}, where φ₀ is the rental
value of the good in the initial period.
[Figure 2: final demand for the lithograph, with final quantity on the horizontal axis.]

FIGURE 2. FINAL DEMAND FOR THE LITHOGRAPH IN THE EXAMPLE (see text)


III. The Model


At time t=0 a (new) durable good is 
introduced. It never depreciates and is 
available initially only through a monopoly 
seller. Once purchased, it can be leased or 
resold through perfectly competitive sec- 
ondary markets. 

The total stock of the good at time t is
$Q_t$, and $q_t = (Q_t - Q_{t-1})$ is the quantity sold
by the monopolist at that time. The inverse
demand for the good’s rental services is
$\phi_t = \phi_t(Q_t, \varepsilon_t)$, where $\varepsilon_t$ is a random vari-
able with probability density function $f_t(\varepsilon_t)$.
Production costs are $\gamma_t = \gamma_t(q_t)$, where
$\gamma_t'(q_t) \ge 0$ and $\gamma_t''(q_t) > 0$.⁵ There is a com-
mon discount rate, δ, and all agents know
$\gamma_t$, $f_t$, and $\phi_t$.

Output can be priced in a variety of ways.
Assume that buyers are risk-neutral. Let $\beta_t^s$
be the payment by a time-t buyer to the
monopolist at time s ≥ t. The monopolist’s
choice of pricing conventions must satisfy

\[
\beta_t^t + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \beta_t^s
= \phi_t(Q_t,\varepsilon_t) + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \phi_s(Q_s,\varepsilon_s) \tag{1}
\]

for all t. The first left-hand-side term, $\beta_t^t$, is
the buyer’s initial payment. Expected dis-
counted payments in subsequent periods
appear in the second expression and can be
positive or negative. If the monopolist rents
the good, then $\beta_t^s = \phi_s$ for all t and all
s ≥ t. With “simple” prices (unaccompanied
by price guarantees), $\beta_t^t > 0$ and $\beta_t^s = 0$ for
s > t.

5Cost uncertainties could also be introduced.

The right side of equation (1) measures 
the expected value of the good’s rental ser- 
vices. The equation therefore constrains the 
monopolist to choose a pricing scheme such 
that expected discounted payments equal 
the expected discounted value of the rental 
services provided. Three pricing mecha-
nisms are outlined in this section; all satisfy
this constraint. 


A. Simple Prices 


Without price guarantees, the price at
time t is

\[
P_t = \phi_t(Q_t,\varepsilon_t) + E_t \sum_{s=t+1}^{\infty} \delta^{s-t}\phi_s(Q_s,\varepsilon_s). \tag{2}
\]

The firm’s discounted cash flow from time t
onward is given by

\[
\Pi_t = \{P_t q_t - \gamma_t(q_t)\} + \sum_{s=t+1}^{\infty} \delta^{s-t}\{P_s q_s - \gamma_s(q_s)\}. \tag{3}
\]

Let $\{q_t^*, t = 0,1,\ldots\}$ be the contingent pro-
duction plan maximizing $E_0\Pi_0$. This plan is
dynamically consistent if it also maximizes
$E_t\Pi_t$ for all t. In general, $\{q_t^*, t = 0,1,\ldots\}$ is
not consistent when simple prices are em-
ployed.

For reasons to be explained shortly, con-
sider only cases where

\[
\phi_t(Q_t^*,\varepsilon_t) \ge \phi_{t+1}(Q_{t+1}^*,\varepsilon_{t+1}) \tag{4}
\]

for all t. This assumption assures that prices
do not rise over time.


B. Infinite-Duration, Two-Party 
Best-Price Provisions 


Suppose BP guarantees extend forever
but apply only to subsequent transactions
with the monopolist. Formally, suppose the
monopolist guarantees anyone with such a
provision at time t that, for all s and all
k ≤ s − t,

\[
\sum_{i=t}^{s} \beta_t^i \le \sum_{j=t+k}^{s} \beta_{t+k}^j. \tag{5}
\]

In short, the monopolist promises each
time-t buyer that its cumulative payments
will not exceed the payments made by any
subsequent buyer.

Let $p_t = \beta_t^t$ be the monopolist’s time-t
price when BP provisions are offered. Then
at time t + 1, new buyers pay $p_{t+1}$ and time-t
buyers receive a rebate of $(p_t - p_{t+1}) =
-\beta_t^{t+1}$. This brings the net cost of the good
for time-t buyers to $p_{t+1}$. In the same fash-
ion, anyone who has purchased a unit of the
good through time t + s − 1 receives a re-
bate of $(p_{t+s-1} - p_{t+s})$ at time t + s. The
net price at time t + s of one unit pur-
chased at time t therefore equals $p_{t+s}$. All
buyers, regardless of their vintage, pay this
same “best” price.

The sale price at time t is determined by
the following:

\[
p_t - E_t \sum_{s=t+1}^{\infty} \delta^{s-t}(p_{s-1} - p_s)
= \phi_t(Q_t,\varepsilon_t) + E_t \sum_{s=t+1}^{\infty} \delta^{s-t}\phi_s(Q_s,\varepsilon_s). \tag{6}
\]

Through algebraic manipulation, (6) implies
that

\[
p_t = (1-\delta)^{-1}\phi_t = \sum_{s=0}^{\infty} \delta^{s}\phi_t. \tag{7}
\]

Best-price provisions fully compensate buy-
ers for any change in the value of the good.
Hence, $p_t$ is set as if inverse demand for-
ever equals $\phi_t$.
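
The step from (6) to (7) can be checked numerically. The sketch below assumes a deterministic, nonincreasing path of rental values, so the expectation operator drops out, and it truncates the infinite sums at a long finite horizon; the path, the discount factor, and the function names are all illustrative assumptions. It sets each price to the rental value divided by one minus the discount factor, as in (7), and compares the two sides of (6).

```python
def bp_price(phi_t, delta):
    """Best-price-provision price from (7): p_t = phi_t / (1 - delta)."""
    return phi_t / (1.0 - delta)


def lhs_of_6(phi, t, delta):
    """Initial price minus discounted future rebates, truncated at len(phi)."""
    p = [bp_price(x, delta) for x in phi]
    rebates = sum(delta ** (s - t) * (p[s - 1] - p[s])
                  for s in range(t + 1, len(phi)))
    return p[t] - rebates


def rhs_of_6(phi, t, delta):
    """Discounted value of rental services from t onward, truncated at len(phi)."""
    return sum(delta ** (s - t) * phi[s] for s in range(t, len(phi)))


if __name__ == "__main__":
    delta = 0.9
    # Hypothetical rental values that decline toward 10, consistent with (4).
    phi = [10.0 + 5.0 * 0.8 ** s for s in range(400)]
    for t in (0, 1, 5):
        print(t, lhs_of_6(phi, t, delta), rhs_of_6(phi, t, delta))
```

The two sides agree up to a truncation error of order δ raised to the remaining horizon, which is negligible here. The check works because the rebate stream telescopes: the left-hand side of (6) equals (1 − δ) times the discounted sum of future prices.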

If the monopolist follows the plan
$\{q_t^*, t = 0,1,\ldots\}$, then (4) and (7) imply that
$p_{t-1} \ge p_t$ for all t. Thus, the rebate paid to
each buyer at time t is always nonnegative.⁶

When the monopolist offers two-party BP
provisions, discounted cash flow from time t
onward is given by⁷

\[
\Pi_t = \{p_t q_t - (p_{t-1} - p_t)Q_{t-1} - \gamma_t(q_t)\}
+ \sum_{s=t+1}^{\infty} \delta^{s-t}\{p_s q_s - (p_{s-1} - p_s)Q_{s-1} - \gamma_s(q_s)\}. \tag{8}
\]



C. One-Period Meet-the-Competition
Provisions

Now consider the monopolist’s problem if 
buyers are promised meet-the-competition 
(MTC) provisions extending for a single pe- 
riod. These provisions are frequently ex- 
tended to buyers of such consumer dura- 
bles as appliances and electronics. Under 
this arrangement, all buyers paying $\alpha_t$ for
the good at time $t$ receive a refund of
$(\alpha_t - P_{t+1})$ at time $t+1$. The seller modeled
here has a monopoly on primary sales,
so all competition comes through the secondary
market.⁸


6 While the analysis that follows does not change if
the assumption given by (4) is dropped, BP provisions
are rarely observed in practice where prices are rising
through time. In such cases, it would be necessary for
the monopolist to collect “surcharges” whenever
$p_{t-1} < p_t$.

7 It is assumed that $p_{-1} = Q_{-1} = 0$.

8 Since the model assumes monopoly, the term “meet
the competition” may be somewhat confusing. Here
atomistic buyers and sellers in the secondary market 
individually have no impact on price. They also have 
rational expectations regarding the monopolist’s quan- 
At time $t$, the monopolist’s price, $\alpha_t$, is
determined by

$$\alpha_t - \delta E_t(\alpha_t - P_{t+1}) = \phi_t(Q_t, \varepsilon_t) + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \phi_s(Q_s, \varepsilon_s). \tag{9}$$

From (9) it follows that

$$\alpha_t = (1-\delta)^{-1} \phi_t = \sum_{s=0}^{\infty} \delta^{s} \phi_t. \tag{10}$$


As before, price is set as if inverse rental
demand forever equals $\phi_t$. By (4) and (10),
$(\alpha_t - P_{t+1}) \ge 0$, so buyers always receive
nonnegative rebates.
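
The step from (9) to (10) can be sketched as below. The sketch assumes that equation (2), which is not reproduced in this section, sets the plain sale price equal to the expected discounted value of rental services, $P_{t+1} = E_{t+1} \sum_{s \ge t+1} \delta^{s-t-1} \phi_s$; that form is an assumption made here for illustration.

```latex
% Rearranging the left-hand side of (9):
\alpha_t - \delta E_t(\alpha_t - P_{t+1}) = (1-\delta)\,\alpha_t + \delta E_t P_{t+1} .
% Under the assumed form of (2), and by iterated expectations,
\delta E_t P_{t+1} = E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \phi_s ,
% which cancels against the summation on the right-hand side of (9), leaving
(1-\delta)\,\alpha_t = \phi_t
\quad\Longrightarrow\quad
\alpha_t = (1-\delta)^{-1} \phi_t .
```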

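With MTC provisions, the monopolist’s
discounted cash flow from time $t$ onward is
given by⁹

$$\Gamma_t = \{\alpha_t q_t - (\alpha_{t-1} - P_t) q_{t-1} - \gamma_t(q_t)\} + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \{\alpha_s q_s - (\alpha_{s-1} - P_s) q_{s-1} - \gamma_s(q_s)\}. \tag{11}$$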


Having described these various options for 
pricing output, attention now turns to their 
impact on the dynamic consistency of the 
monopolist’s plans. 


IV. The Impact of Best-Price and 
Meet-the-Competition Provisions 


This section provides three results. First, 
for any given production plan, price guaran- 
tees can be extended from the outset with- 
out altering expected profits. Second, if the 
monopolist offers two-party BP provisions, 
then the production plan $\{q_t^{*},\ t = 0, 1, \ldots\}$ is
dynamically consistent. In contrast, one- 
period MTC provisions mitigate but do not 
resolve the monopolist’s problems with dy- 


tity decisions, as well as future prices and rental values. 
Whenever price exceeds the expected value of the 
good’s discounted flow of rental services, anyone hold- 
ing the good attempts to sell it. Whenever price falls 
below the value of these services, the quantity de- 
manded exceeds the total stock of the good outstand- 
ing.

9 It is assumed that $\alpha_{-1} = Q_{-1} = 0$.
namic inconsistency. Finally, the pricing
provisions themselves are dynamically con- 
sistent: once the monopolist adopts BP or 
MTC provisions, there is no gain from drop- 
ping them at a later date. 


A. Expected Profits 


For the moment, ignore problems with 
dynamic inconsistency. For any given pro- 
duction plan, the monopolist’s expected rev- 
enue per unit, given in equations (1), (2), 
(6), and (9), is the same. Buyers pay only for 
the expected rental services they consume. 
They may initially pay more when offered 
BP guarantees, but only if they expect fu- 
ture rebates. Formally, it is shown in the 
Appendix that for any plan $\{q_t,\ t = 0, 1, \ldots\}$,

$$E_0\Pi_0 = E_0\Omega_0 = E_0\Gamma_0 = \{\phi_0 q_0 - \gamma_0(q_0)\} + E_0 \sum_{t=1}^{\infty} \delta^{t} \{\phi_t Q_t - \gamma_t(q_t)\}.$$
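
The equivalence of expected profits can be illustrated with a small deterministic check. The sketch below is not the paper's proof. It assumes a linear inverse rental demand, a constant marginal cost, a finite output plan followed by zero output, and a plain sale price equal to the discounted value of rental services; the function `cash_flows` and all parameter values are hypothetical.

```python
def cash_flows(q, a=10.0, b=1.0, c=1.0, delta=0.9):
    """Time-0 discounted cash flow under plain sales, BP, and one-period MTC.

    q is a finite output plan q_0, ..., q_{T-1}; output is zero afterwards, so
    the stock and the rental value phi stay constant from date T-1 on.
    """
    T = len(q)
    Q, total = [], 0.0
    for x in q:
        total += x
        Q.append(total)                      # cumulative stock Q_t
    phi = [a - b * Qt for Qt in Q]           # assumed linear inverse rental demand
    tail = phi[-1] / (1.0 - delta)           # value of constant rentals forever

    def sale_price(t):
        """P_t: discounted value of rental services from date t onward."""
        if t >= T:
            return tail
        near = sum(delta ** (s - t) * phi[s] for s in range(t, T))
        return near + delta ** (T - t) * tail

    guar = [f / (1.0 - delta) for f in phi]  # p_t = alpha_t = phi_t / (1 - delta)

    sales = sum(delta ** t * (sale_price(t) - c) * q[t] for t in range(T))

    bp = 0.0
    for t in range(T):
        # BP: rebate the price drop on the whole outstanding stock Q_{t-1}.
        rebate = (guar[t - 1] - guar[t]) * Q[t - 1] if t > 0 else 0.0
        bp += delta ** t * (guar[t] * q[t] - rebate - c * q[t])

    mtc = 0.0
    for t in range(T):
        # MTC: refund only last period's buyers, down to the market price P_t.
        refund = (guar[t - 1] - sale_price(t)) * q[t - 1] if t > 0 else 0.0
        mtc += delta ** t * (guar[t] * q[t] - refund - c * q[t])
    # The date-T refund is zero here because alpha_{T-1} equals P_T.

    rental = sum(delta ** t * (phi[t] * Q[t] - c * q[t]) for t in range(T))
    rental += delta ** T * phi[-1] * Q[-1] / (1.0 - delta)
    return sales, bp, mtc, rental


print(cash_flows([3.0, 2.0, 1.0]))
```

In this setup the four returned numbers coincide up to floating-point error, which mirrors the claim that buyers pay only for the rental services they expect to consume, whatever the pricing convention.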


B. Consistency of Production Plans 


Two-party BP provisions protect owners 
of the durable good from all future changes 
in its value. When choosing output, the 
monopolist therefore internalizes the change 
in value of all units sold previously. One- 
period MTC provisions compensate buyers 
only from the previous period, so the 
monopolist considers the welfare of only a 
fraction of former customers. Hence, MTC 
provisions mitigate but do not resolve the 
dynamic inconsistency problem. The first 
proposition, proved in the Appendix, states 
these results formally. 


PROPOSITION 1: If the monopolist offers
infinite-duration, two-party BP provisions
to all buyers, then the production plan
$\{q_t^{*},\ t = 0, 1, \ldots\}$ is dynamically consistent.
However, this plan is not in general consis- 
tent when one-period MTC provisions are 
employed. 


One-period MTC provisions are useful but 
not completely effective. Unlike two-party 
BP provisions, however, they require only a 
one-time rebate and may be less expensive 
to administer. One-period MTC provisions
may also give the monopolist some discretion
to discriminate intertemporally when
secondary markets are imperfect (see Sec- 
tion VII). Finally, MTC provisions can be 
coupled with other commitments to en- 
hance their effectiveness. 

One question still remains: does the 
monopolist at any point have an incentive to 
switch pricing plans? In other words, is the 
adoption of best-price provisions dynami- 
cally consistent? 


C. Consistency of Pricing Plans 


The specifications of $\Omega_t$ and $\Gamma_t$ assume
that, once the monopolist adopts BP or 
MTC provisions, they are also offered to all 
subsequent buyers. Yet by changing pricing 
policies, the monopolist can alter the timing 
—and perhaps the magnitude—of the re- 
bates paid to previous customers. Could 
such a switch increase expected profits? The 
second proposition, proved formally in the 
Appendix, provides the answer. 


PROPOSITION 2: Suppose the monopolist 
has offered BP or MTC provisions through 
time $t-1$. Then, at time $t$ the monopolist
cannot increase expected discounted future 
cash flow by changing pricing policies. 


Price guarantees index the payments for 
units purchased in one period to the pay- 
ments for units purchased subsequently. The 
expected discounted value of the payments 
for units purchased at time $t$ must equal the
value of the rental services provided [by eq. 
(1)]. The monopolist can alter the timing of 
rebates, but not their (discounted) magni- 
tude. Hence, the monopolist has nothing to 
gain by switching to a different pricing pol- 
icy at any future date. With two-party BP 
provisions, the monopolist’s output decision 
and pricing policy are both dynamically con- 
sistent. With one-period MTC provisions, 
the monopolist’s pricing policy is dynami- 
cally consistent, but the output decision is 
not. 


V. The Length of Period 


Though the model is most naturally ex- 
posited in discrete time, the results may be 
sensitive to the length of the period (see
Stokey, 1981; Kahn, 1986). In discrete time, 
the monopolist has a slight ability to com- 
mit; once output has been chosen in one 
period, it cannot be changed until the next. 
As the length of the period decreases, so 
does the monopolist’s ability to commit in 
this manner. 

With one-period MTC provisions, the 
monopolist internalizes the change in value 
of only those goods sold in the previous 
period. If the period is shortened, then the 
monopolist’s production decisions reflect the 
impact on a smaller fraction of all units sold 
previously, and the effectiveness of these 
provisions declines. Yet because the 
monopolist can respond by extending the 
duration of MTC guarantees, period length 
is not important. 

Infinite-duration, two-party BP provisions 
commit the monopolist to internalize the 
market value of all units sold previously. Al- 
tering period length does not diminish the 
effectiveness of this commitment in any way. 


VI. The Relative Merits of Best-Price
Provisions 


The problem of dynamic inconsistency can 
be addressed in a variety of ways, including 
not only best-price guarantees, but also 
rental arrangements, repurchase provisions, 
and output commitments. International 
commodity agreements (see Sections II and 
VII) routinely extend best-tariff guarantees,
while artists seem to prefer “limited
editions” of their work. Repurchase agree- 
ments and durable-good rentals are possible 
but rarely observed. This section addresses 
the following question: what factors lead to 
the use of best-price guarantees in some 
circumstances but not others? 

Some advantages of BP guarantees are 
immediately apparent. They often redis- 
tribute risk efficiently and grant the 
monopolist maximum flexibility to choose 
future output. If prices are publicly observ- 
able, then monitoring and enforcement costs 
are low. Unlike rental or repurchase agree- 
ments, BP provisions do not obligate the 
monopolist to reassume ownership of the 
good. Hence, they are more attractive when 
the good has some ex post specificity. 
Although the model assumes rational and
homogeneous expectations, best-price pro- 
visions may be especially attractive when 
expectations differ. As an illustration, con- 
sider the following numerical example. As 
before, suppose the seller knows that $\eta$ is
uniformly distributed over the interval
[0, 1,000], but now assume that buyers (mistakenly)
believe that $\eta$ is distributed uniformly
over [0, 2,000]. If the seller employs
quantity commitments, the promise is the
same as before: $q_1 = \eta$ if $\eta \ge 667$ and $q_1 = 0$
otherwise. But now the parties cannot agree
on a time-0 price. Buyers believe there is a
2/3 chance of a price cut at time t = 1, even 
though the true probability is only 1/3. 
Hence, they are not willing to pay what the 
prints are worth. Rather than accept a low 
price, the seller prefers to defer sales until
time $t = 1$.
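
The two probabilities quoted in this example follow directly from the uniform distributions. The short sketch below reproduces them; the helper name `prob_price_cut` is an illustrative assumption, and the threshold of 667 is the one stated in the text.

```python
def prob_price_cut(threshold, upper):
    """P(X >= threshold) when X is uniform on [0, upper]."""
    if upper <= 0:
        raise ValueError("upper must be positive")
    return min(1.0, max(0.0, (upper - threshold) / upper))


THRESHOLD = 667  # output is sold at time 1 only if the draw is at least this large

seller_belief = prob_price_cut(THRESHOLD, 1000)  # true distribution
buyer_belief = prob_price_cut(THRESHOLD, 2000)   # buyers' mistaken distribution

print(f"seller: {seller_belief:.3f}, buyers: {buyer_belief:.3f}")
```

The seller's probability is roughly one third and the buyers' is roughly two thirds, which is the gap in beliefs that prevents agreement on a time-0 price under quantity commitments.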

With best-price provisions, buyers pay 
$100 for the print at time t = 0 regardless of 
their expectations, and they receive a rebate 
of $40 at time t=1 if and only if the 
monopolist sells additional output. Al- 
though buyers think a price cut is likely, this 
affects neither their willingness to pay nor 
the seller’s output decisions. 

If these expectations are reversed, then 
the seller prefers quantity commitments to 
BP provisions. However, if buyers know they 
are not well-informed, then BP provisions 
are preferred whether buyers are more or 
less optimistic than the seller. To illustrate, 
suppose a retail seller of consumer appli- 
ances knows what prices will be in the next 
period, but customers do not. Even if the 
retailer has no market power, customers are 
reluctant to pay full price, since they might 
miss out on a sale. Best-price provisions 
assure them that they can buy their appli- 
ances now and still take advantage of a 
lower price offered in the next period.¹⁰

Since best-price provisions can be re- 
stricted to specific geographic areas, brand 
names, or time periods, they permit limited 
pursuit of both intratemporal and intertem- 


10 Most retailers of consumer appliances probably do 
not wield significant market power. Hence, asymmetric 
information about future demand may provide a better 
explanation for BP provisions in this setting. 
poral price discrimination.¹¹ This may help
to explain why MTC provisions are used 
even though they do not fully address prob- 
lems with dynamic inconsistency. 

There are also clear disadvantages to 
best-price provisions. In practice, they are 
rarely employed when prices are rising over 
time. If $\phi_t(Q_t^{*}, \varepsilon_t) < \phi_{t+1}(Q_{t+1}^{*}, \varepsilon_{t+1})$, then
$(p_t - p_{t+1})$ and $(\alpha_t - P_{t+1})$ are negative.
Instead of a rebate, customers pay a
“surcharge.” In addition to the obvious re- 
luctance of customers to pay such a levy, it 
may be difficult for the monopolist to track 
down former customers. 

If prices are falling monotonically over 
time, then two-party versions of the clause 
require recurring refunds. Best-price provi- 
sions have the greatest appeal, therefore, 
when subsequent payments are either un- 
likely or inexpensive to distribute. 

Perhaps the greatest complications arise 
with heterogeneous products. Best-price 
provisions prevent only intertemporal price 
discrimination. If the monopolist can offer 
subsequent customers higher quality, better 
warranties, free delivery, or other perks, 
then the protection offered by best-price 
provisions may be worth very little. 

This problem can be addressed by adjust- 
ing for product differences and by promis- 
ing most-favorable treatment along other 
economically relevant dimensions of the 
contract. Long-term contracts between nat- 
ural-gas pipelines and producers often 
promise producers the best quality-adjusted 
price and contain prorationing provisions to 
prevent quantity discrimination. Nonethe- 
less, heterogeneity increases the cost of us- 
ing BP provisions. 

Summarizing, several conditions must 
hold before best-price provisions can be em- 
ployed successfully. Prices must be publicly 
observable and must not be rising over time; 
refunds must be either infrequent or inex- 
pensive to distribute; and the product must 
be roughly homogeneous. The provisions are 
relatively more advantageous when buyers 
are more risk-averse or less well-informed 
than the monopolist or when the good has 


11 Intertemporal price discrimination would work
only if secondary markets are imperfect. 
some ex post specificity. Relative to quantity
commitments, best-price provisions are es- 
pecially attractive when future contingen- 
cies are difficult to outline ex ante and to 
verify ex post. 


VII. Applications 


My model employs highly restrictive as- 
sumptions, including rational and homoge- 
neous expectations, risk neutrality, perfect 
secondary markets, infinite durability, pure 
monopoly, and a common discount rate. 
These can all be relaxed. In the numerical 
example, best-price provisions work even 
when expectations are irrational and regard- 
less of buyers’ risk preferences. The seller 
has a monopoly over a certain type of print, 
but this hardly constitutes a “pure” 
monopoly. The illustration assumes no sec- 
ondary markets and never mentions individ- 
ual rates of time discount or infinite durabil- 
ity. 
Only three conditions appear to be neces- 
sary: asset durability, seller market power, 
and forward-looking expectations. Under 
these circumstances, the seller must assure 
buyers that the value of their assets will not 
be diluted through excessively high output 
levels in future periods. This section dis- 
cusses the role played by best-price and 
nondiscrimination clauses in settings where 
these conditions arise. 


A. International Commodity Agreements 


A country controls access to its domestic 
market for some good and licenses foreign 
trading partners to sell output there. These 
licenses are long-lived and paid for through 
reciprocal concessions and tariff revenues. 
For example, the United States might li- 
cense Japanese car sales in the United States 
in exchange for a license to market wheat in 
Japan. The agreement might specify a $500 
tariff on each car and $1 tariff on each 
bushel of wheat. 

The value of these licenses depends on 
the number and type of licenses granted 
to other trading partners. After negotiating 
the treaty described above, suppose the 
United States agrees to a $100 tariff on 
West German cars. This agreement significantly
reduces the value of Japan’s license
to sell cars here. If Japan had contemplated 
this scenario when negotiating their agree- 
ment, they would have demanded compen- 
sation. 

In practice, compensation comes through 
most-favored-nation (MFN) provisions. An 
MFN provision would assure Japan that its
tariff will be reduced to $100 once the 
U.S.-German accord is signed. 

These provisions are not problem-free. 
First, they promise only nondiscriminatory 
tariffs and can be circumvented through 
quotas and other nontariff trade barriers. 
Second, imports are rarely homogeneous, so
countries can discriminate through their
product classifications (e.g., the Suzuki
Samurai jeep could be classified as a recre- 
ational vehicle rather than an automobile). 
Third, countries may wish to discriminate in 
favor of some trading partners (e.g., allies 
or lesser-developed countries), and MFN 
provisions offer only limited opportunities 
for doing so. 

Finally, disputes often arise over the value 
of reciprocal tariff concessions. If the 
U.S.—German accord allows duty-free sales 
of U.S. wheat in West Germany, should 
Japan be obligated to offer the same terms, 
or should its tariff be reduced to $100 re- 
gardless of the German concessions? 

Each of these problems can be addressed 
by adding language to the treaty. Nontariff 
trade barriers can be prohibited, product 
definitions can be more precise, and excep- 
tions can be made for preferred trading 
partners. Because the value of reciprocal 
concessions is difficult to measure, MFN 
provisions typically apply unconditionally. In 
our example, the damages to Japanese car 
makers are the same regardless of the concessions
granted to Germany. Hence, it is
far simpler to ignore them when determining
tariff adjustments.¹²


12 Unconditional MFN treatment results in “spillover
effects.” Japan effectively free-rides on the U.S.
and German efforts to liberalize trade. Spillover effects 
have been a primary reason why countries have moved 
toward multilateral tariff bargaining. 
B. The Market for
Electric Turbogenerators 


Perhaps the most infamous application of 
best-price provisions is in the market for 
electric turbogenerators. Prior to 1963, this 
market was characterized by elaborate col- 
lusive efforts, as well as chronic price wars. 
In 1963, General Electric introduced a
“price protection plan,” and Westinghouse, 
its largest rival, soon followed suit. Each 
plan guaranteed that if the firm gave a 
discount on any new turbogenerator order, 
the same discount would be offered retroac- 
tively on all orders taken within the previ- 
ous six months. At the same time, the firms 
adopted simplified booking procedures to 
standardize pricing and opened their records 
for public scrutiny. 

These policies made it unprofitable to cut 
prices, since any discounts to one customer 
involved rebates to all others. After some 
initial rivalry, the firms established price 
stability, and there was not one price cut by 
either firm until the plans were terminated 
by government order in 1977. 
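
The arithmetic behind this deterrent can be sketched with made-up numbers. The function below is a hypothetical illustration, not a model from the paper: it compares the margin earned on a discounted new order with the retroactive rebates that a price-protection plan would trigger on orders still inside the protection window.

```python
def net_gain_from_discount(list_price, unit_cost, discount, new_units, protected_units):
    """Change in profit from winning new_units at a discount, versus not selling them.

    With a price-protection plan the same per-unit discount must also be paid
    retroactively on protected_units (orders taken within the protection window).
    """
    margin_on_new_order = (list_price - discount - unit_cost) * new_units
    retroactive_rebates = discount * protected_units
    return margin_on_new_order - retroactive_rebates


# Hypothetical figures: list price 100, unit cost 60, discount of 10 on one new unit.
without_plan = net_gain_from_discount(100.0, 60.0, 10.0, 1, 0)
with_plan = net_gain_from_discount(100.0, 60.0, 10.0, 1, 5)
print(without_plan, with_plan)
```

With these illustrative figures the discount is profitable when no past orders are protected and unprofitable once five recent orders must be rebated, which is the sense in which the plans made price cuts unattractive.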

Although tacit collusion is an obvious ex- 
planation for these provisions (see Section 
II), there is a complementary interpretation: 
potential buyers, aware of recurring price 
wars, did not find cartel behavior credible. 
Whenever possible, they postponed orders, 
hoping to take advantage of the next round 
of discounts. Best-price provisions assured 
buyers that they could place their orders 
and still take advantage of subsequent re- 
ductions. At the same time, they committed 
the firms not to cut price. 

In the conventional explanation, best- 
price provisions enhance cooperation be- 
tween firms at each point in time. Here the 
provisions enable a single cartel to collude 
with itself across time. Best-price provisions 
could be commitments to industry rivals, yet 
they could also serve as commitments to 
buyers. The two explanations are perfectly 
compatible and together lead to a richer 
model. Even if the firms were able to col- 
lude, they would still need a dynamically 
consistent plan; and dynamic consistency 
would not have been an issue if the firms 
were competitors. 
C. Discrimination Between Stockholders


Suppose a corporation is financed solely 
through the sale of shares of common stock. 
Let $V_t$ be the aggregate value of these shares
at time $t$. Then if $N_t$ is the number of
claims outstanding, the price per share is

$$P_t = \frac{V_t}{N_t}. \tag{12}$$


Now suppose at this time that $n_t > 0$ additional
shares are sold in a financial transaction.¹³
The firm charges $p_t$ per share, and
the proceeds, $p_t n_t$, are retained by the firm.
The value of the original $N_t$ shares is affected
in two ways. First, each share’s proportionate
claim on the firm falls. Instead of
owning the fraction $1/N_t$, each share now
commands only $1/(N_t + n_t)$. Second, the
proceeds are retained, so the aggregate
value of the firm rises to $V_t'$,¹⁴ where

$$V_t' = V_t + p_t n_t. \tag{13}$$

The two effects work in opposite directions:
the original shares each command a smaller
fraction of a larger pie. The new price, $P_t'$,
is

$$P_t' = \frac{V_t'}{N_t + n_t} = \frac{V_t + p_t n_t}{N_t + n_t}. \tag{14}$$


If $p_t = P_t$, then from equations (12) and (14)
it follows that $P_t' = P_t$. In other words,


13 By assuming that the exchange is purely financial,
I rule out the possibility that the stock is offered in 
exchange for services rendered. Thus, I rule out cases 
in which stock is offered to employees, management, or 
board members as part of an overall compensation 
package. I also rule out cases involving contests for 
corporate control, since such contests involve not only 
the exchange of financial assets, but also the manner in 
which real resources are allocated. 

14 This discussion merely illustrates the role of discrimination.
In a formal proof, it would be assumed
that the proceeds of the stock sale, p,n,, are invested 
in assets that can be bought and sold by individual 
investors on the same terms available to the corpora- 
tion. When coupled with other assumptions adopted in 
propositions on the irrelevance of financial policy, this 
assures that the value of the firm changes by exactly 
$p_t n_t$.




shareholders are indifferent to the sale (re- 
purchase) whenever the terms are nondis- 
criminatory. The revenues retained from the 
sale of stock increase the firm’s value by just 
enough to compensate for the reduction in 
each share’s proportionate claim. If $p_t < P_t$,
then $P_t' < P_t$. If the sale is discriminatory,
shareholders are unambiguously worse off. 
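
Equations (12) through (14) are easy to verify with numbers. The sketch below uses a hypothetical firm value and share count; the function name `post_issue_price` is an assumption made for illustration, and a negative `n` is used to represent a repurchase.

```python
def post_issue_price(V, N, p, n):
    """Price per share after selling n new shares at p each, following (13)-(14)."""
    V_new = V + p * n          # (13): proceeds are retained by the firm
    return V_new / (N + n)     # (14): value spread over all shares outstanding


V, N = 1000.0, 100.0           # hypothetical firm value and share count
P = V / N                      # (12): pre-transaction price per share

fair_sale = post_issue_price(V, N, P, 20.0)         # new shares sold at the market price
discount_sale = post_issue_price(V, N, 8.0, 20.0)   # new shares sold below the market price
premium_buyback = post_issue_price(V, N, 12.0, -20.0)  # repurchase above the market price

print(P, fair_sale, discount_sale, premium_buyback)
```

In this illustration the nondiscriminatory sale leaves the share price unchanged, while both the discounted sale and the above-market repurchase leave the remaining shareholders with a lower price per share.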

Suppose the firm does not promise to 
refrain from discriminatory transactions. If 
investors anticipate discrimination, then 
stock prices and the value of the firm fall 
before the discrimination occurs. 

This last conclusion appears to be at odds 
with propositions showing that financial pol- 
icy is irrelevant. Yet most irrelevance 
propositions, including Franco Modigliani 
and Merton Miller’s (1958), do not address 
the dynamics of financial policy. The propo- 
sition closest in spirit to this discussion is 
outlined by Eugene Fama (1978). Fama 
shows that financial policy has no effect on 
the firm’s value even when it makes no 
commitments regarding future sales of fi- 
nancial claims. However, Fama assumes that 
all transactions take place at market prices 
and thereby rules out discrimination. Fama’s 
result should therefore be amended to say 
that financial policy is irrelevant even if the 
only commitment is to refrain from discrim- 
inatory financial transactions. 

Discrimination becomes especially prob- 
lematic when shareholders are heteroge- 
neous. The firm would like to commit not to 
enter into discriminatory transactions that 
redistribute wealth from one class of share- 
holders to another. By doing so, it lowers its 
cost of capital. Yet discrimination may be 
necessary to compensate shareholders en- 
gaged in costly but value-enhancing contests 
for corporate control (Sanford Grossman 
and Oliver Hart, 1980). Controversies sur- 
rounding targeted share repurchases illus- 
trate the difficulties of pursuing both objec- 
tives simultaneously. 


VIII. Conclusions


Although the Coase conjecture is ex- 
posited using very specific assumptions, the 
problem of dynamic inconsistency can arise 
whenever a seller (or buyer) of a durable 
asset possesses market power. Best-price
provisions represent one mechanism for ad- 
dressing this problem. They are practical 
only under special circumstances but are 
remarkably simple and effective. 

When extended to all buyers, best-price 
provisions guarantee equal treatment. 
Equal-treatment guarantees appear in a va- 
riety of settings and are often justified on 
purely normative grounds. Through the 
model and applications, this paper outlines 
a positive analysis of these provisions. 

While it is individually rational for buyers 
to accept best-price provisions, in equilibrium
it appears that consumers as a whole
suffer rather than benefit. Yet there are 
various reasons why the reader should not 
draw sweeping policy conclusions from 
this study. First, suppose a durable-good 
monopolist—or monopolistic competitor— 
has large fixed costs and low marginal cost. 
Then, without some means to raise price 
above marginal cost, revenues do not cover 
costs, and the monopolist produces nothing. 
Second, even when best-price provisions 
hurt consumers, they may be less pernicious 
than other alternatives the monopolist may 
employ to commit to the monopoly output. 
Third, in some settings, including interna- 
tional commodity treaties and financial con- 
tracts, best-price provisions and equal-treat- 
ment guarantees are widely credited with 
raising consumer welfare. 

Problems with dynamic inconsistency 
could provide a motive for nondiscrimina- 
tion provisions in other contexts, as well. 
Suppose a firm requires its employees to 
invest in specific and long-lived human capi- 
tal. The return on these investments de- 
pends upon both the wage and the number 
of hours worked. Once the investments have 
been sunk, the firm has monopsony power, 
and workers face the danger that their 
quasi-rents will be expropriated. The firm 
can commit to wage rates and employment 
levels, but this hampers flexibility. 

Instead the firm can promise most favor- 
able treatment. Each worker is guaranteed 
wages and employment conditions at least 
as favorable as those offered to all other 
workers. If workers are heterogeneous, then 
those with more specific investments could 
be guaranteed a higher wage and first priority
in the allocation of hours.

While this explanation is not implausible,
risk sharing, tacit collusion, and information 
asymmetries have been forwarded as com- 
peting explanations, and normative consid- 
erations may also play a role. If the model is 
to be extended to settings such as these,
tests may be constructed to evaluate these 
competing hypotheses empirically. 


APPENDIX 


PROOF THAT $E_0\Pi_0 = E_0\Omega_0 = E_0\Gamma_0$:
Substitute equation (2) into (3) and take 
expected values: 


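$$E_0\Pi_0 = \Big\{\Big[\phi_0 + E_0 \sum_{s=1}^{\infty} \delta^{s} \phi_s\Big] q_0 - \gamma_0(q_0)\Big\} + E_0 \sum_{t=1}^{\infty} \delta^{t} \Big\{\Big[\phi_t + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \phi_s\Big] q_t - \gamma_t(q_t)\Big\}$$

$$= \{\phi_0 q_0 - \gamma_0(q_0)\} + E_0 \sum_{t=1}^{\infty} \delta^{t} \{\phi_t Q_t - \gamma_t(q_t)\}. \tag{A1}$$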


Substitute equation (7) into (8) and take 
expected values: 
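$$E_0\Omega_0 = \{(1-\delta)^{-1} \phi_0 q_0 - \gamma_0(q_0)\} + E_0 \sum_{t=1}^{\infty} \delta^{t} \{(1-\delta)^{-1} (\phi_t Q_t - \phi_{t-1} Q_{t-1}) - \gamma_t(q_t)\}$$

$$= \{\phi_0 q_0 - \gamma_0(q_0)\} + E_0 \sum_{t=1}^{\infty} \delta^{t} \{\phi_t Q_t - \gamma_t(q_t)\}. \tag{A2}$$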


After taking expected values, equation (11) 
implies 


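$$E_0\Gamma_0 = \{\alpha_0 q_0 - \gamma_0(q_0)\} - \delta(\alpha_0 - E_0 P_1) q_0 + E_0 \sum_{t=1}^{\infty} \delta^{t} \{\alpha_t q_t - \delta(\alpha_t - P_{t+1}) q_t - \gamma_t(q_t)\}. \tag{A3}$$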
By equations (2) and (9),

$$\alpha_t - \delta(\alpha_t - E_t P_{t+1}) = \phi_t + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \phi_s. \tag{A4}$$


Substitution of (A4) into (A3) yields 


$$E_0\Gamma_0 = \{\phi_0 q_0 - \gamma_0(q_0)\} + E_0 \sum_{t=1}^{\infty} \delta^{t} \{\phi_t Q_t - \gamma_t(q_t)\}. \tag{A5}$$


Comparing (A1), (A2), and (A5), it follows
that $E_0\Pi_0 = E_0\Omega_0 = E_0\Gamma_0$.


To facilitate exposition, the proof of 
Proposition 2 precedes that of Proposi- 
tion 1. 


PROOF OF PROPOSITION 2: 

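Suppose first that the monopolist has offered
MTC provisions through time $t-1$.
The monopolist’s cash flow from time $t$ onward
is given by $F_t$, where

$$F_t = \{\beta_t^{t} q_t - (\alpha_{t-1} - P_t) q_{t-1} - \gamma_t(q_t)\} + \sum_{s=t+1}^{\infty} \delta^{s-t} \Big\{\sum_{k=t}^{s} \beta_s^{k} q_k - \gamma_s(q_s)\Big\}. \tag{A6}$$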


The monopolist chooses the $\beta$’s to maximize
$E_t F_t$ subject to the constraint given by
equation (1). After taking expected values 
in equation (A6), equation (1) can be substi- 
tuted to give the monopolist’s uncon- 
strained objective function: 


$$E_t F_t = \{\phi_t q_t - (\alpha_{t-1} - P_t) q_{t-1} - \gamma_t(q_t)\} + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \{\phi_s (Q_s - Q_{t-1}) - \gamma_s(q_s)\}. \tag{A7}$$


In its unconstrained form, the monopolist’s 
objective function is independent of the 
pricing conventions adopted. 

Now suppose the monopolist has offered 
BP provisions through time $t-1$. Cash flow
from time $t$ onward is $G_t$, where

$$G_t = \Big\{\beta_t^{t} q_t + \sum_{k=0}^{t-1} \beta_t^{k} q_k - \gamma_t(q_t)\Big\} + \sum_{s=t+1}^{\infty} \delta^{s-t} \Big\{\sum_{k=0}^{s} \beta_s^{k} q_k - \gamma_s(q_s)\Big\}, \tag{A8}$$

and the $\beta$’s are determined by the
monopolist’s BP obligations. The monopolist
chooses pricing conventions to maximize
$E_t G_t$ subject to the constraint given by
equation (1).

If the monopolist chooses to continue offering
BP provisions, then $\beta_t^{t} = p_t$ and
$\beta_s^{t} = (p_s - p_{s-1})$ for all $s \ge t+1$. These results,
together with equations (1), (7), and (A8),
imply that

$$E_t G_t = E_t \Omega_t = \{\phi_t Q_t - p_{t-1} Q_{t-1} - \gamma_t(q_t)\} + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \{\phi_s Q_s - \gamma_s(q_s)\}. \tag{A9}$$

If the monopolist chooses not to continue 
offering BP provisions, then the rebates paid 
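to buyers from periods prior to $t$ are determined
by equation (5). This equation, together
with the fact that

$$\sum_{s=k}^{t-1} \beta_s^{k} = p_{t-1}$$

for all $k \le t-1$, implies that

$$\beta_t^{k} + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \beta_s^{k} \le \phi_t - p_{t-1} + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \phi_s \tag{A10}$$
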

for all $k \le t-1$. After taking expected values
of both sides of equation (A8), equations
(1) and (A10) can be substituted in to
yield

$$E_t G_t \le \{\phi_t Q_t - p_{t-1} Q_{t-1} - \gamma_t(q_t)\} + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \{\phi_s Q_s - \gamma_s(q_s)\}.$$

Comparing this to (A9) demonstrates that
the monopolist’s expected discounted cash
flow is at least as high using BP provisions
as it would be with any other pricing convention.


PROOF OF PROPOSITION 1: 
Let $\Omega_t^{*}$ refer to $\Omega_t$ when the monopolist
follows the plan $\{q_t^{*},\ t = 0, 1, \ldots\}$, and let $\hat{\Omega}_t$
represent these expected cash flows when
the monopolist switches to the plan
$\{\hat{q}_s,\ s = t, t+1, \ldots\}$ at time $t$. Let $Q_t^{*}$ and $\hat{Q}_t$
refer to $Q_t$ when the monopolist sets $q_t = q_t^{*}$
and $q_t = \hat{q}_t$, respectively, through time $t$.


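Suppose $E_t\hat{\Omega}_t > E_t\Omega_t^{*}$. By equation (A9),
this implies that

$$\phi_t \hat{Q}_t - \gamma_t(\hat{q}_t) + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \{\phi_s \hat{Q}_s - \gamma_s(\hat{q}_s)\} > \phi_t Q_t^{*} - \gamma_t(q_t^{*}) + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \{\phi_s Q_s^{*} - \gamma_s(q_s^{*})\}. \tag{A11}$$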


Equation (A1) can be written as follows: 


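$$E_0\Pi_0 = \{\phi_0 q_0 - \gamma_0(q_0)\} + E_0 \sum_{s=1}^{t-1} \delta^{s} \{\phi_s Q_s - \gamma_s(q_s)\} + E_0\, \delta^{t} \Big\{[\phi_t Q_t - \gamma_t(q_t)] + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} [\phi_s Q_s - \gamma_s(q_s)]\Big\}. \tag{A12}$$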


By (A11) the last right-hand-side expression
in (A12) is greatest when the monopolist
follows the plan $\{q_s^{*},\ s = 0, 1, \ldots, t-1\}$
through time $t-1$ but switches to
$\{\hat{q}_s,\ s = t, t+1, \ldots\}$ for the remaining time.
The first two right-hand-side expressions are
not affected by this switch, so it follows that
$E_0\Pi_0$ is also greatest when the monopolist
switches to $\{\hat{q}_s,\ s = t, t+1, \ldots\}$ at time $t$; but
this is a contradiction, since $\{q_t^{*},\ t = 0, 1, \ldots\}$
maximizes $E_0\Pi_0$. Hence, $E_t\hat{\Omega}_t \le E_t\Omega_t^{*}$.

Now consider $E_t\Gamma_t$. Note that $E_t\Gamma_t = E_tF_t$,
where $E_tF_t$ is defined by equation (A7). The
monopolist at time $t$ chooses $\{q_s,\ s = t, t+1, \ldots\}$
to maximize $E_tF_t$. Equation (A7)
can be rewritten as


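$$E_t\Gamma_t = E_tF_t = \{\phi_t (q_t + q_{t-1}) - \alpha_{t-1} q_{t-1} - \gamma_t(q_t)\} + E_t \sum_{s=t+1}^{\infty} \delta^{s-t} \{\phi_s (Q_s - Q_{t-2}) - \gamma_s(q_s)\}. \tag{A13}$$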


Comparison of equations (A13) and (A12)
reveals that the plan $\{q_s,\ s = t, t+1, \ldots\}$ that
maximizes $E_t\Gamma_t$ is not, in general, the plan
that maximizes $E_0\Pi_0$.
A Schumpeterian Model of the Product Life Cycle


By PAUL S. SEGERSTROM, T. C. A. ANANT, AND ELIAS DINOPOULOS*


This paper presents a dynamic general equilibrium model of North-South trade 
in which research and development races between firms determine the rate of 
product innovation in the North. Tariffs designed to protect dying industries in the 
North from Southern competition reduce the steady-state number of dominant 
firms in the North, reduce the rate of product innovation, and increase the 
relative wage of Northern workers. (JEL 411, 111) 


In his celebrated “product life cycle” pa- 
per Raymond Vernon (1966) argued that 
many products experience cycles. These 
products are initially discovered and pro- 
duced in developed countries (the North), 
and exported to less developed countries 
(the South). As the techniques of produc- 
tion become more standardized, production 
shifts to less developed countries due to 
lower labor costs. These older products are 
then exported back to developed countries. 

The product-life-cycle hypothesis has at- 
tracted considerable attention among inter- 
national-trade theorists in recent years. In 
this literature, the rate at which an individ- 
ual firm discovers and successfully markets 
new products is either treated as exoge- 
nously given (Paul Krugman, 1979; David 
Dollar, 1986, 1987) or as a “deterministic” 
function of the firm’s expenditures on new 
product development (Robert Feenstra and 
Kenneth Judd, 1982; Thomas Pugel, 1982; 
Barbara Spencer and James Brander, 1983; 
Leonard Cheng, 1984; Richard Jensen and 
Marie Thursby, 1986, 1987). Thus, from the 
individual firm’s perspective, successful
product innovation is either effortless or 
guaranteed by large expenditures on new 
product development. In contrast, Joseph 
Schumpeter (1942) stressed that firms com- 
pete with each other to successfully in- 
troduce new products. The recent indus- 
trial-organization literature has followed 
Schumpeter’s lead (see, e.g., Glen Loury, 
1979; Tom Lee and Louis Wilde, 1980; 
Jennifer Reinganum, 1982). In these re- 
search and development (R&D) models, 
there are losers as well as winners because a 
firm can spend substantial resources on new 
product development only to find that an- 
other firm has discovered and patented the 
new product first. 

In this paper, we construct a dynamic, 
general equilibrium model of North-South 
trade that combines the product-life-cycle 
hypothesis with Schumpeter’s (1942) de- 
scription of product innovation. We model 
each R&D race as an “invention lottery” in 
which the probability of winning the race is 
proportional to resources devoted to R&D 
by each firm. The duration of each R&D 
race is a deterministic decreasing function 
of the amount of aggregate resources de- 
voted to R&D. Every time a new product is 
discovered, a new R&D race between firms 
in the North begins. The winner of each 
R&D race earns dominant firm profits for 
an exogenously given patent period, after 
which perfect competition prevails. Firms in 
the North choose how much labor to hire 
for R&D by maximizing expected dis- 
counted profits, and consumers maximize 
their discounted lifetime utility. 
We show that a unique steady-state equi-
librium exists in which the number of new 
products, consumer expenditures, and as- 
sets are all constant over time. In the steady 
state, Northern workers earn higher wages 
than their Southern counterparts if the 
South has a sufficiently large fraction of the 
world labor force. Moreover, the pattern of 
trade continuously changes with each prod- 
uct initially being exported and then later 
imported by the North. 

Endogenizing the rate of technological 
change generates some surprising compara- 
tive steady-state results in our model. When 
wages in the North and in the South are 
equal, an increase in the patent length (or a 
decrease in the rate of technology transfer 
to the South) increases the rate of product 
innovation in the North. This result is con- 
sistent with the partial-equilibrium indus- 
trial-organization literature on R&D com- 
petition, because an increase in the patent 
length increases the reward for winning an 
R&D race. However, when Northern work- 
ers earn higher wages than Southern work- 
ers, an increase in the patent length de- 
creases the rate of product innovation in the 
North. The increase in the patent length, by 
itself, increases the reward for innovative 
activity, but Northern wages rise more than 
enough to offset this effect. 

Unlike the previously mentioned studies, 
we also examine the effects of tariffs de- 
signed to protect dying industries in the 
North from Southern competition. When 
Northern workers earn higher wages than 
Southern workers, we find that an increase 
in the number of industries being protected 
in the North leads to higher relative wages 
for Northern workers and a slower rate of 
innovation. Thus, we are able to theoreti- 
cally link protectionist trade policies with 
slower economic growth. 

The rest of this paper is organized as 
follows: In Section I, the dynamic general 
equilibrium model of North-South trade is 
presented. In Section II, we characterize
the steady-state equilibrium of the model. 
The relationship between relative labor en- 
dowments and steady-state relative wages is 
explained in Section III. Section IV ana- 
lyzes the effects of changes in patent length 
and tariff protection on the steady-state
equilibrium. Finally, our conclusions are 
presented in Section V. 


I. The Model 


Consider a world consisting of two countries:
the North and the South. Let $L^N$ and
$L^S$ be the aggregate endowments of labor in
the North and in the South, respectively. 
These labor endowments do not change over 
time. 

In this world, there is a countably infinite
set of products $N = \{1, 2, 3, \ldots\}$. At any point
in time $t \in [0, \infty)$, these products can be
partitioned into three sets: the set of prod- 
ucts that any firm in the world knows how 
to produce, the set of products that only 
one firm in the world knows how to pro- 
duce, and the set of products that no firm in 
the world knows how to produce. Firms that 
produce products in the second set are 
called dominant firms. 

At time $t$, every firm that knows how to
produce product $j \in N$ has the same
production technology. Constant returns to
scale prevail in production, with one unit of 
labor producing one unit of product j. La- 
bor is the only factor of production, and all 
workers in the world are equally productive. 
Factors of production are not mobile inter- 
nationally. 

At time $t = 0$, there are $n$ products for
which the production technology is common 
knowledge, and the remainder of the prod- 
ucts have unknown production technology. 
Time t=0 represents the beginning of a 
sequence of R&D races between firms in 
the North. Only workers in the North are 
capable of doing R&D-type work, and 
therefore only firms in the North compete 
in these R&D races. Every time an R&D 
race in the North ends, a new R&D race 
immediately begins. At the beginning of the 
jth R&D race, each firm i in the North 
must decide how much labor $L_{ij}^R$ to devote
to R&D. This choice by firm $i$ represents a
commitment to employing $L_{ij}^R$ units of R&D
labor for the duration of the $j$th R&D race.
The firm that wins the $j$th R&D race becomes
the sole producer in the world of
product $n + j$ for a time period of length
$T > 0$. After this “patent” expires, the
production technology for product $n + j$
becomes common knowledge.¹

We model each R&D race as an “invention
lottery.” Each of the $L_j^R$ worker-researchers
participating in the $j$th R&D
race is equally likely to discover product
$n + j$ at time $\hat t = h(L_j^R)$ after the beginning
of the $j$th R&D race, where $L_j^R = \sum_i L_{ij}^R$ is
the aggregate labor devoted to R&D. Thus,
at the end of the $j$th race, nature draws one
of the $L_j^R$ “lottery tickets” and one of the
$L_j^R$ worker-researchers discovers the new
product. The firm that employs the winner
earns dominant firm profits until its patent
expires. Firm $i$ wins the $j$th race with
probability $L_{ij}^R / L_j^R$.
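
The lottery just described can be sketched numerically. The snippet below is only an illustration under stated assumptions: the functional form $h(L) = h_0/(1 + bL)$, the parameter values, and the three firms' R&D labor choices are hypothetical and are not taken from the paper.

```python
import random

def race_length(total_rd_labor, h0=1.0, b=0.1):
    # Hypothetical h(L) = h0 / (1 + b L): finite at L = 0, decreasing, convex.
    return h0 / (1.0 + b * total_rd_labor)

def win_probabilities(rd_labor_by_firm):
    # Firm i wins with probability L_i / L (its share of the lottery tickets).
    total = sum(rd_labor_by_firm)
    return [labor / total for labor in rd_labor_by_firm]

def draw_winner(rd_labor_by_firm, rng):
    # Nature draws one ticket uniformly over all worker-researchers.
    ticket = rng.uniform(0.0, sum(rd_labor_by_firm))
    cumulative = 0.0
    for firm, labor in enumerate(rd_labor_by_firm):
        cumulative += labor
        if ticket <= cumulative:
            return firm
    return len(rd_labor_by_firm) - 1  # guard against floating-point rounding

rd_labor = [1.0, 2.0, 5.0]            # assumed R&D labor of three firms
probs = win_probabilities(rd_labor)   # [0.125, 0.25, 0.625]
rng = random.Random(0)
draws = 20000
wins = [0] * len(rd_labor)
for _ in range(draws):
    wins[draw_winner(rd_labor, rng)] += 1
shares = [count / draws for count in wins]
print(probs, shares, race_length(sum(rd_labor)))
```

The simulated winning shares should lie close to the labor shares, and the race length shrinks as aggregate R&D labor grows, which is the intertemporal trade-off discussed in the text.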

By modeling the R&D process as an “invention
lottery,” we capture two features of
R&D races that we feel are important. First,
individual firms investing in R&D face an
uncertain return; there are winners and
losers. In contrast with other models of
technological change and international trade
(see, e.g., Jensen and Thursby, 1986; Gene
Grossman and Elhanan Helpman, 1989;
Feenstra and Judd, 1982), a firm is not
guaranteed success in developing a new
product by spending some fixed sum of
money on product development. Secondly,
new products tend to be discovered faster
and at a greater discounted cost, as more
resources are devoted to R&D [this is implied
by properties of the $h(\cdot)$ function to
be specified shortly]. There is an intertemporal
trade-off associated with R&D activity.
From the individual firm’s perspective,
as it spends more money on R&D (a flow
expenditure), its probability of winning the
race increases, other firms’ probabilities of
success decline, and the race ends sooner.²


¹$T$ is inversely related to the rate of technology
transfer in Krugman (1979) and Jensen and Thursby 
(1986). “Patents” need not be given a literal interpre- 
tation. The patent length T serves as a proxy for all 
relevant factors that impede technology transfer be- 
tween the North and South. 

²The deterministic length $\hat t = h(L^R)$ of each R&D
race is admittedly artificial, but as will become clear,
the main results in the paper are driven by factor 
market constraints, which would be present whether 
the length of each R&D race were deterministic or 
In this model, at each point in time $t$,
wages for workers in each country are de- 
termined by competitive market forces. We 
set the equilibrium wage rate for workers in 
the South equal to one and let w denote the 
equilibrium relative wage of workers in the 
North. Both production workers and R&D 
workers in the North are paid the same 
wage w. 

In addition to the absence of interna- 
tional labor mobility, we assume that pro- 
duction of goods protected by patents takes 
place only in the North. In other words, 
Southern firms can produce a good only 
after its patent protection has expired. Pos- 
sible institutional justification for this as- 
sumption would be that enforcement of 
patent laws in the South is considerably 
weaker than in the North and the labor 
market within the South constitutes an ef- 
fective channel of technology diffusion. 
Thus, in the absence of effective patent 
protection in the South, a Northern domi- 
nant firm producing in the South faces the 
risk that some of its workers might establish 
another firm manufacturing the same prod- 
uct. 

Infinitely lived consumers maximize total 
lifetime utility. Each consumer has an iden- 
tical time-separable utility function 


(1)  $U = \int_0^{\infty} e^{-\rho t} \log u(\cdot)\, dt$


where $\rho > 0$ is the constant subjective discount
rate and $u(\cdot)$ is an instantaneous
utility function.⁴ We adopt a particular


stochastic. To analyze tractably the effects of commercial
policy in a general equilibrium setting, we also
abstract from certain interesting features of R&D races 
(established-firm advantages, variable R&D expendi- 
tures over time, imitation in spite of patent protection, 
etc.) that have been extensively studied in the partial- 
equilibrium industrial-organization literature. 

³Allowing dominant firms to produce in the South
could generate multinational firms along the lines pro- 
posed by Grossman and Helpman (1989). Multina- 
tional activity in our model results in the wage being 
equalized between the North and South unless all 
Northern labor is engaged in R&D. 

⁴The same form of total lifetime utility is used by
Grossman and Helpman (1989). 
form of $u(\cdot)$,


(2)  $u(x_1, x_2, x_3, \ldots) = \prod_{j=1}^{n} \left[ \sum_{i=0}^{\infty} \alpha^i x_{j+in} \right].$


This is a generalized symmetric Cobb-Douglas
utility function where $n$ can be
interpreted as the number of product
groups. Since products within each group
are perfect substitutes, we call this the CDP
(Cobb-Douglas with perfect substitutes)
utility function. Product
group $j$ ($j = 1, 2, 3, \ldots, n$) consists of
products $j$, $n + j$, ...; and $\alpha > 1$ represents the
extent to which each new product improves
upon existing products in the same product
group.

To illustrate the effect of product innovation
on consumer utility, suppose that initially
there are $n$ products available for
consumption. Given time separability, consumers
are, in effect, maximizing the utility
function $u = x_1 x_2 x_3 \cdots x_n$ at that instant in
time. The discovery of product $n + 1$ means
that consumers are now, in effect, maximizing
the utility function $u = (x_1 + \alpha x_{n+1}) x_2 x_3 \cdots x_n$.
If the equilibrium prices of
products 1 and $n + 1$ both equal one, which
would be the case if both products 1 and
$n + 1$ were produced competitively, then no
consumer would purchase product 1 (given
$\alpha > 1$), and it would become obsolete. Thus,
new products substitute perfectly for old
products, and product innovation in our
model takes the form of superior products
replacing inferior products.⁵
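
The obsolescence argument can be checked with a small computation. The sketch below assumes the CDP form $u = \prod_{j=1}^{n} [\sum_{i} \alpha^i x_{j+in}]$ described above; the function name `cdp_utility`, the choice $n = 3$, $\alpha = 1.5$, and the bundles are hypothetical illustrations.

```python
def cdp_utility(quantities, n, alpha):
    # quantities: dict mapping 1-based product index -> quantity consumed.
    # Product j + i*n belongs to group j and carries quality weight alpha**i.
    utility = 1.0
    for group in range(1, n + 1):
        group_total = 0.0
        for product, quantity in quantities.items():
            if product >= group and (product - group) % n == 0:
                generation = (product - group) // n
                group_total += alpha ** generation * quantity
        utility *= group_total
    return utility

n, alpha = 3, 1.5   # assumed number of product groups and quality step

# With all prices equal to one, one unit of spending per group buys one unit.
old_bundle = {1: 1.0, 2: 1.0, 3: 1.0}            # buy product 1 in group 1
new_bundle = {4: 1.0, 2: 1.0, 3: 1.0}            # buy product n + 1 = 4 instead
mixed_bundle = {1: 0.5, 4: 0.5, 2: 1.0, 3: 1.0}  # split group-1 spending

u_old = cdp_utility(old_bundle, n, alpha)
u_new = cdp_utility(new_bundle, n, alpha)
u_mixed = cdp_utility(mixed_bundle, n, alpha)
print(u_old, u_mixed, u_new)
```

At equal prices, shifting all group-1 spending to the newer product gives the highest utility, so the older product attracts no demand.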

We assume that there is a capital market 
in the North which supplies the savings of 
Northern consumers to firms engaged in 
R&D. The equilibrium interest rate r(t) 
clears the capital market at each point in 
time $t$. Firms borrow funds from this market
to pay workers as the R&D is done. Each 
firm issues a risky security that yields a 
positive return if it wins and a negative 
return if it loses an R&D race. Assuming 


⁵Nancy Stokey (1988) models product replacement
in a different context. The rest of the product-life-cycle 
literature has treated product innovation as being the 
introduction of greater variety. 
risk neutrality and perfect competition in
R&D, Northern firms enter each R&D race 
until expected discounted profits are driven 
to zero. Southern consumers are not al- 
lowed to participate in the capital market, 
and therefore at each instant of time, their 
income equals their expenditure. If this as- 
sumption is relaxed, then Southern savings 
would end up financing part of the North’s 
R&D expenditure, a result that is contrary
to empirical evidence.⁶


Free trade is assumed to exist between
the North and the South throughout time,
and products are assumed to be nonstorable.
Furthermore, at any time $t$, perfect
competition prevails in the market for
each product whose patent has expired.
Thus, the market price for all such products
equals the marginal cost of production in
the South (one). Given the consumer preferences,
one unit of product $j$ gives each
consumer as much utility as $\alpha$ units of
product $j - n$. When both products are
competitively produced and sell at the same market
prices (equal to one), the competitive market
for product $j$ renders product $j - n$
obsolete.

The endowment of labor in the North $L^N$
is assumed to be sufficiently small so that,
even if all the workers in the North did
R&D-type work, the number of dominant
firms would be less than the number of
product groups $n$. That is, $h(L^N)\, n > T$. This
condition guarantees that there are never
two dominant firms producing products in
the same product group.⁷ At time $t$, the
dominant firm producing product $j$ must

⁶All the comparative steady-state results concerning
the effects of changes in labor endowments, patents, 
and tariffs on the number of dominant firms, the rate 
of innovation, relative wages, and world assets would 
be unaffected if the capital market were international. 
Nor would the magnitudes of these variables be af- 
fected. However, the distribution of world assets be- 
tween the North and the South and the pattern of 
trade in the steady state would change. 

⁷Even if innovations did not occur in the previously
described sequence, all the results in the paper would
be unaffected if product $j$ is only discovered when
product $j - n$ is competitively produced. For example,
it is possible that certain groups do not experience any 
innovation at all. 
compete only against a competitive fringe of
firms producing product $j - n$.

The dominant firm and firms in the competitive
fringe simultaneously set prices, and
we solve for a Bertrand-type Nash equilibrium.
Let $E^W$ denote instantaneous world
expenditure. Given the instantaneous CDP
utility function, world expenditure at time $t$
on products in product group $j$ is $E^W/n$.
With the equilibrium price of one being
charged by firms in the competitive fringe,
the dominant firm has zero sales if it charges
a price $p^d$ greater than $\alpha$. On the other
hand, the competitive fringe has zero sales
if the dominant firm charges a price $p^d$ less
than $\alpha$. If $p^d = \alpha$, then consumers are
indifferent between spending $E^W/n$ on
product $j$ and spending $E^W/n$ on product $j - n$.
We assume that all the indifferent consumers
buy from the dominant firm (all
rules for rationing the demand of indifferent
consumers among firms are somewhat
arbitrary). Then dominant-firm profits are


(3)  $\pi^d(p^d) = \begin{cases} 0 & \text{if } p^d > \alpha \\ (p^d - w)\, E^W / (p^d n) & \text{if } p^d \le \alpha. \end{cases}$


These profits are clearly maximized where
$p^d = \alpha$. Thus, in the Nash noncooperative
equilibrium in prices that we examine in the
rest of this paper, each dominant firm
produces $q^d = E^W/(\alpha n)$ and earns profits


(4)  $\pi^d = \frac{(\alpha - w)\, E^W}{\alpha n}.$


The competitive fringe constrains each
dominant firm from charging prices higher
than $\alpha$.
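
The limit-pricing result can be illustrated with a grid search. This is a minimal sketch that assumes the profit function takes the piecewise form $\pi^d(p^d) = (p^d - w)E^W/(p^d n)$ for $p^d \le \alpha$ and zero otherwise; the wage, expenditure level, $n$, and $\alpha$ used here are hypothetical values.

```python
def dominant_profit(price, wage, world_expenditure, n, alpha):
    # Zero sales above alpha; otherwise the firm serves the whole
    # group demand E^W / (n * price) at unit labor cost equal to the wage.
    if price > alpha:
        return 0.0
    return (price - wage) * world_expenditure / (price * n)

wage, world_expenditure, n, alpha = 1.0, 30.0, 50, 1.5   # assumed values

prices = [k / 100 for k in range(100, 201)]   # price grid from 1.00 to 2.00
best_price = max(
    prices,
    key=lambda p: dominant_profit(p, wage, world_expenditure, n, alpha),
)
best_profit = dominant_profit(best_price, wage, world_expenditure, n, alpha)

# Closed form for profits at the limit price p = alpha.
closed_form = (alpha - wage) * world_expenditure / (alpha * n)
print(best_price, best_profit, closed_form)
```

Because $(p - w)/p$ rises with $p$, profits increase up to the quality premium $\alpha$ and drop to zero above it, so the grid maximum sits exactly at $p = \alpha$.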

Several restrictions are placed on the $h(\cdot)$
function that defines the R&D technology.
First, $h(\cdot)$ is assumed to be continuously
differentiable with $h'(\cdot) < 0$. This guarantees
that product innovation occurs at a
faster rate when firms in the North devote
more resources to R&D. Second, $\bar h = h(0) < +\infty$;
that is, some product innovation occurs
even if no resources are devoted to
R&D. Third, $h(L^R) > 0$ and $h''(L^R) > 0$ for
all $L^R > 0$; that is, no matter how much
labor is devoted to R&D, innovation never
occurs instantaneously. Fourth,


(5)  $\frac{d}{dL^R} \left[ L^R \left( e^{\rho h(L^R)} - 1 \right) \right] > 0.$


Notice that $\int_0^{h(L^R)} w L^R e^{\rho t}\, dt = w L^R (e^{\rho h(L^R)} - 1)/\rho$
is the discounted labor cost of developing
a new product (discounted to the
end of the R&D race) when the market
interest rate $r(t)$ equals each consumer’s
subjective discount rate $\rho$. We show in the
next section that $r(t) = \rho$ in the steady-state
equilibrium. Thus, this condition states that
the appropriately discounted labor costs of
developing a new product rise as firms try to
speed up the process by devoting more
resources to R&D. Equation (5) will hold if
the $h(\cdot)$ function is downward sloping but
sufficiently flat. Fifth, we make a technical
restriction


(6)  $-n h'(0) < \frac{(1 - e^{-\rho T})\, h(L^N)}{L^N \left( e^{\rho h(L^N)} - 1 \right) T},$


which will also hold if the $h(\cdot)$ function is
sufficiently flat. As shown in Appendix A,
condition (6) is sufficient but hardly necessary
for the steady-state equilibrium to be
unique.

Finally, we assume that the labor force
in the North ($L^N$) is sufficiently large relative
to the labor force in the South ($L^S$) so
that


(7)  $\frac{n \alpha L^N}{L^S + \alpha L^N} > \frac{T}{\bar h}.$


As will become clear in Section II, inequality
(7) guarantees that, in any steady-state
equilibrium, aggregate R&D expenditures
are strictly positive.
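
To make the restrictions on the R&D technology concrete, the following sketch checks several of them on a grid for one candidate function. Everything here is an assumption for illustration: the form $h(L) = 1/(1 + 0.1L)$, the parameter values, and the reading of condition (7) as $n\alpha L^N/(L^S + \alpha L^N) > T/h(0)$. Condition (6), which the text describes as sufficient but hardly necessary, is not checked.

```python
import math

RHO, ALPHA, N_GROUPS, T = 0.05, 1.5, 50, 10.0   # assumed parameter values
L_NORTH, L_SOUTH = 10.0, 20.0                    # assumed labor endowments

def h(labor, h0=1.0, b=0.1):
    # Hypothetical race-length function, not taken from the paper.
    return h0 / (1.0 + b * labor)

def rd_cost_term(labor):
    # L^R * (exp(rho * h(L^R)) - 1), the expression differentiated in (5).
    return labor * math.expm1(RHO * h(labor))

def strictly_increasing(values):
    return all(later > earlier for earlier, later in zip(values, values[1:]))

grid = [L_NORTH * k / 200 for k in range(201)]
lengths = [h(labor) for labor in grid]
slopes = [b - a for a, b in zip(lengths, lengths[1:])]

h_is_decreasing = all(slope < 0 for slope in slopes)       # h' < 0
h_is_convex = strictly_increasing(slopes)                  # h'' > 0
condition_5 = strictly_increasing([rd_cost_term(labor) for labor in grid])
one_firm_per_group = h(L_NORTH) * N_GROUPS > T             # h(L^N) n > T
condition_7 = (N_GROUPS * ALPHA * L_NORTH
               / (L_SOUTH + ALPHA * L_NORTH)) > T / h(0.0)

print(h_is_decreasing, h_is_convex, condition_5, one_firm_per_group, condition_7)
```

A grid check of this kind is only suggestive; it does not replace verifying the derivative conditions analytically for the chosen function.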


II. The Steady-State Equilibrium


In this section, we show that a unique
steady-state equilibrium exists for the dynamic,
general equilibrium model of
North-South trade. In this steady state, the
number of dominant firms $m$, the relative
wage of Northern workers $w$, the profit flow
of each dominant firm $\pi^d$, the aggregate
labor devoted to research and development
$L^R$, world expenditure $E^W$, world wage income
$I^W$, Northern assets $A^N$, and the market
interest rate $r$ are all positive constants
over time and are interrelated in several
specific ways.

World expenditure $E^W$, Northern assets
$A^N$, and the equilibrium interest rate $r$ must
be consistent with the consumer’s
savings-consumption decisions over time. Let $t_J$
represent the beginning of an R&D race
where $J$ innovations have already occurred.
The representative consumer’s discounted
future utility from expenditure path $E(t)$,
$t \in [t_J, \infty)$, is


(8)  $U = \sum_{i=0}^{\infty} \int_{t_J + i\hat t}^{t_J + (i+1)\hat t} e^{-\rho t} \log\left[ \frac{\alpha^{J+i} E(t)^n}{\alpha^m n^n} \right] dt + \Gamma(J, t_J)$


where $\hat t = h(L^R)$ is the length of each
steady-state R&D race and $m$ is the number
of products produced by dominant firms.
Because the $m$ products produced by dominant
firms are sold at price $p^d = \alpha$ and the
$n - m$ competitively produced goods are
sold at price $p^c = 1$, every time an innovation
occurs, the consumer’s instantaneous
utility increases by factor $\alpha$. From the point
of view of future decision making,
$\Gamma(J, t_J) = \int_0^{t_J} e^{-\rho t} \log u(\cdot)\, dt$ is a constant. Appendix B
shows that optimal consumer behavior is
characterized by a constant expenditure
path over time when the steady-state interest
rate equals $\rho$, the consumer’s subjective
discount rate. Furthermore, it is shown in
Appendix B that the relationship among
steady-state expenditure $E^W$, assets $A^N$, and
wage income $I^W$ is


(9)  $E^W = \rho A^N + I^W.$


Each consumer spends his wage income and 
interest earnings from his assets at each 


⁸World assets equal Northern assets and Southern
expenditure equals Southern income, because South- 
ern consumers do not participate in the capital market. 
instant in time. These assets have been accumulated
before the economy reaches the
steady-state equilibrium.⁹ In other words,
equation (9) is just a breakdown of steady-state
expenditure by income source.

It is perhaps surprising that, although the 
instantaneous utility function is character- 
ized by periodic jumps (caused by innova- 
tions), there nevertheless exists a steady 
state with constant expenditures and a con- 
stant interest rate over time. However, the 
optimal expenditure path is derived from 
the marginal utility of expenditure function 
(see Appendix B), which is invariant with 
respect to jumps caused by innovations. 

Wage income in the world consists of
income from production work and income
from R&D work:

(10)  $I^W = w L^N + L^S.$

Notice that equation (10) implies that R&D
workers are paid concurrently. Because
goods are nonstorable, world GNP must
equal world expenditure $E^W$:

(11)  $E^W = m\pi^d + w(L^N - L^R) + L^S.$

The first two terms represent the value
of Northern production, and the last term
equals the value of Southern production.

Expected discounted profits of firm $i$ in a
typical R&D race are


(12)  $-w L_i^R \left[ 1 - e^{-\rho h(L^R)} \right] + \frac{L_i^R}{L^R}\, \pi^d e^{-\rho h(L^R)} \left[ 1 - e^{-\rho T} \right].$


Firm $i$ must pay each of $L_i^R$ workers the
wage $w$ for the duration $h(L^R)$ of the R&D
race. With probability $L_i^R / L^R$ the $i$th firm
wins the race and earns profits $\pi^d$ until its
patent expires. Perfect competition and free
entry in each R&D race drive expected
discounted profits of each firm to zero.
Summing over all firms in the R&D race,
we obtain


(13)  $-w L^R \left[ 1 - e^{-\rho h(L^R)} \right] + \pi^d e^{-\rho h(L^R)} \left[ 1 - e^{-\rho T} \right] = 0.$


In other words, aggregate profits discounted
to the beginning of an R&D race are equal
to zero.

Using equations (10), (11), and (13), it can
be shown that the value of aggregate assets
is


(14)  $A^N = \frac{m\pi^d - wL^R}{\rho} = \frac{\pi^d}{\rho} \sum_{j=1}^{m} \left( 1 - e^{-\rho j \hat t} \right).$


Equation (14) implies that $A^N > 0$. Assets
must be positive in the steady state, because
Northern consumers saved in the past to
finance the innovation process that led to $m$
dominant firms.
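
The two expressions for steady-state assets can be reconciled in a few lines. The derivation below is a sketch that assumes the asset equation reads $A^N = (m\pi^d - wL^R)/\rho = (\pi^d/\rho)\sum_{j=1}^{m}(1 - e^{-\rho j \hat t})$, that $m$ is an integer with $m\hat t = T$, and that the zero-profit condition (13) holds.

```latex
\begin{align*}
\sum_{j=1}^{m}\left(1 - e^{-\rho j \hat t}\right)
  &= m - e^{-\rho \hat t}\,\frac{1 - e^{-\rho m \hat t}}{1 - e^{-\rho \hat t}}
   = m - \frac{1 - e^{-\rho T}}{e^{\rho \hat t} - 1}
   && \text{(geometric sum, } m\hat t = T\text{)} \\[4pt]
w L^R\left(e^{\rho \hat t} - 1\right)
  &= \pi^d\left(1 - e^{-\rho T}\right)
   && \text{(zero profits, (13) multiplied by } e^{\rho \hat t}\text{)} \\[4pt]
\frac{\pi^d}{\rho}\sum_{j=1}^{m}\left(1 - e^{-\rho j \hat t}\right)
  &= \frac{\pi^d}{\rho}\left(m - \frac{w L^R}{\pi^d}\right)
   = \frac{m\pi^d - w L^R}{\rho}.
\end{align*}
```

Read this way, the sum is the present value of the remaining patent lives of the $m$ dominant firms, each term being strictly positive, which is why assets are positive.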

We use geometric techniques to derive
the solution and perform comparative
steady-state analysis. Combining equations
(4), (11), and (13) yields


(15)  $m = Z(L^R, w) \equiv \frac{n\alpha}{\alpha - w} - \frac{\left[ w(L^N - L^R) + L^S \right] (1 - e^{-\rho T})}{w L^R \left( e^{\rho h(L^R)} - 1 \right)}.$


With $w$ fixed, $Z$ can be interpreted as the
steady-state zero-profit condition expressed
in $(m, L^R)$ space. Given the assumptions
about $h(\cdot)$ in Section I, the partial derivatives
are unambiguously signed: $\partial Z/\partial L^R > 0$,
$\partial Z/\partial w > 0$, and $\lim_{L^R \to 0} Z(L^R, w) = -\infty$ for
any $w \ge 1$.

The function $m = Z(L^R, w)$ increases in
$L^R$ for the following reason: from condition
(5), when $L^R$ increases, the discounted cost
of developing an innovation $L^R(e^{\rho h(L^R)} - 1)$
increases. Equation (13) implies that to
maintain zero discounted profits, dominant
firm profits $\pi^d$, which represent the reward
for winning an R&D race, must be higher.
For a given wage $w$, $\pi^d$ is higher only if
world expenditure $E^W$ is higher [eq. (4)],
and from equation (11), there must be more
dominant firms for world expenditure to be
higher. In other words, firms can only afford
to devote more resources to R&D if there
are more dominant firms earning positive
economic profits and, thus, higher world
income and expenditure.

For a given $L^R$, an increase in $w$ leads
to a proportionate increase in $\pi^d$ by equation
(13). From equation (4), world expenditure
must increase more than proportionately.
Since world wage income $w(L^N - L^R) + L^S$
increases less than proportionately with
an increase in $w$, equation (11) implies that
$m$ must increase. Thus, for a given $L^R$,
$m = Z(L^R, w)$ increases in $w$. In other words,
for firms to justify paying their R&D workers
higher wages, world expenditure on their
products must increase, and this can only
occur if there are more dominant firms
earning profits.

To maintain $m$ dominant firms in the
steady state, each time a patent expires, a
new product must be discovered. Thus, during
the period of time $T$, $m$ new products
must be discovered. This generates the
steady-state R&D supply function in
$(m, L^R)$ space:


(16)  $m = R(L^R) = T / h(L^R).$


Given the properties of $h(\cdot)$, the R&D supply
increases in $L^R$; $\partial R/\partial L^R > 0$ and
$R(0) = T/\bar h > 0$, because some product innovation
occurs even if no resources are devoted
to R&D. Finally, $R(L^N) = T/h(L^N) < n$,
which guarantees that the number of dominant
firms is less than the number of product
groups.

The relative wage $w$ is determined by
competitive market forces. Depending on
the distribution of labor endowments between
the North and South, the steady-state
relative wage $w$ can exceed or be equal to
one. The relative wage $w$ cannot be less
than one, since Northern workers are as
productive as Southern workers and only
Northern workers can do R&D-type work.

If $w > 1$ in the steady-state equilibrium,
then $m$ products are produced exclusively
in the North by dominant firms and $n - m$
products are produced exclusively in the
South by competitive firms, with one product
from each product group being produced
at any point in time. By symmetry,
for each product that the South produces,
the aggregate output is $L^S/(n - m)$,
and the equilibrium price is one. Since
this production must satisfy world demand,
$L^S/(n - m) = E^W/n$. Thus,


(17)  $L^R + \frac{m L^S}{\alpha(n - m)} = L^N$


whenever $w > 1$. Equation (17) states that
all workers in the North are either engaged
in R&D or manufacturing for dominant
firms. It implicitly defines $m = F(L^R)$. $F$
can be interpreted as the steady-state labor
market constraint in $(m, L^R)$ space. Clearly
$\partial F/\partial L^R < 0$. Given equation (17), any increase
in $L^R$ must be matched by an equal
decrease in the aggregate output of dominant
firms. This can only happen if both
world expenditure and the number of dominant
firms decrease. When $w > 1$ in the
steady-state equilibrium, the three graphs
$Z(\cdot)$, $R(\cdot)$, and $F(\cdot)$ must simultaneously
intersect. This case is illustrated by point A
in Figure 1.

On the other hand, if a competitive product
is produced in the North, it must be that
Northern and Southern production workers
get paid the same wage $w = 1$; that is,
$L^R + mE^W/(n\alpha) < L^N$. Then, the aggregate
output for the typical Southern-produced
product is $L^S/(n - m)$. However, this production
does not have to satisfy demand;
that is, $L^S/(n - m) < E^W/n$. Thus, when
$w = 1$, the graphs $Z(\cdot)$ and $R(\cdot)$ must intersect
at a point where the labor market constraint
$m < F(L^R)$ is satisfied. This case is
illustrated by point B in Figure 2. It is
proved in Appendix A that a unique
steady-state equilibrium exists in both cases.
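
The two-case characterization above suggests a simple numerical procedure, sketched here under loudly stated assumptions: the zero-profit locus is taken to be $Z(L^R, w) = n\alpha/(\alpha - w) - [w(L^N - L^R) + L^S](1 - e^{-\rho T})/[wL^R(e^{\rho h(L^R)} - 1)]$, the R&D supply is $R = T/h(L^R)$, the labor constraint $F$ solves $L^R + mL^S/[\alpha(n - m)] = L^N$ for $m$, and $h(L) = 1/(1 + 0.1L)$ with the parameter values below is purely hypothetical. The bisection helper assumes a sign change on the bracket and does not check uniqueness.

```python
import math

PARAMS = dict(rho=0.05, alpha=1.5, n=50, T=10.0, L_north=10.0)  # assumed

def h(L, h0=1.0, b=0.1):
    # Hypothetical race-length function.
    return h0 / (1.0 + b * L)

def R(L, p):
    # R&D supply: m = T / h(L^R).
    return p["T"] / h(L)

def F(L, p, L_south):
    # Labor market constraint solved for m.
    production_labor = p["L_north"] - L
    return (p["alpha"] * p["n"] * production_labor
            / (L_south + p["alpha"] * production_labor))

def Z(L, w, p, L_south):
    # Zero-profit locus in (m, L^R) space for a given relative wage w.
    reward = 1.0 - math.exp(-p["rho"] * p["T"])
    cost = w * L * math.expm1(p["rho"] * h(L))
    wage_income = w * (p["L_north"] - L) + L_south
    return p["n"] * p["alpha"] / (p["alpha"] - w) - wage_income * reward / cost

def bisect(f, lo, hi, iterations=200):
    # Plain bisection; assumes f changes sign between lo and hi.
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def steady_state(p, L_south):
    # Case w = 1: intersect Z(., 1) with R and check the labor constraint.
    L = bisect(lambda x: Z(x, 1.0, p, L_south) - R(x, p), 1e-9, p["L_north"])
    m = R(L, p)
    if m <= F(L, p, L_south):
        return {"w": 1.0, "L_R": L, "m": m}
    # Case w > 1: R and F pin down (L^R, m); Z then pins down w.
    L = bisect(lambda x: R(x, p) - F(x, p, L_south), 0.0, p["L_north"])
    m = R(L, p)
    w = bisect(lambda x: Z(L, x, p, L_south) - m, 1.0, p["alpha"] - 1e-9)
    return {"w": w, "L_R": L, "m": m}

small_south = steady_state(PARAMS, L_south=20.0)
large_south = steady_state(PARAMS, L_south=50.0)
print(small_south)
print(large_south)
```

With these hypothetical numbers, the smaller Southern labor force yields the equal-wage case and the larger one yields a Northern wage premium, matching the qualitative pattern described in the text.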

In this steady-state equilibrium, each
product $j \in N$ experiences a Vernon-type
product life cycle. Once discovered, each
product is produced in the North by a dominant
firm for a period of length $T$. Then
production shifts to a competitive industry
in the South (if $w > 1$). Eventually each
product becomes obsolete, and world production
ceases.


III. Labor Endowments and Wages in the
Steady-State Equilibrium 


FIGURE 1. THE STEADY-STATE EQUILIBRIUM WITH $w > 1$

FIGURE 2. THE STEADY-STATE EQUILIBRIUM WITH $w = 1$

In this section, we examine the relationship
between relative labor endowments and
steady-state relative wages. By varying the
Southern labor force $L^S$, we show that if $L^S$
is sufficiently small the steady-state relative
wage $w$ equals one and that if $L^S$ is above
some critical value the steady-state relative
wage $w$ exceeds one. Furthermore, the comparative
steady-state effects of an increase
in the Southern labor endowment depend
critically on which case we are in. To see
this, consider the steady-state effect of a
once-and-for-all increase in the Southern
labor force $L^S$. Increasing $L^S$ causes both
the zero profit condition $Z(L^R, 1)$ and the
labor market constraint $F(L^R)$ to shift down.
If the steady-state relative wage $w^*$ equals
one [and eq. (17) holds with strict inequality],
then an increase in $L^S$ increases the
steady-state labor force engaged in R&D
($L^R$) and increases the steady-state number
of dominant firms in the North ($m$). This is
illustrated by the movement from point A
to point B in Figure 3. However, for sufficiently
large $L^S$, the labor market constraint
becomes binding, and the steady-state
relative wage begins to rise. With $w^* > 1$,
an increase in $L^S$ decreases the
steady-state labor force engaged in R&D,
decreases the steady-state number of dominant
firms in the North, and increases the
relative wage of Northern workers.¹⁰ This
case is illustrated by the movement from
point A to point B in Figure 4.

FIGURE 3. EFFECTS OF INCREASE IN SOUTHERN
LABOR ENDOWMENT WHEN $w = 1$

FIGURE 4. EFFECTS OF INCREASE IN SOUTHERN
LABOR ENDOWMENT WHEN $w > 1$

¹⁰An increase in $L^S$, given $w$, shifts down $Z(\cdot)$ (not
shown in Fig. 4). Since the final equilibrium is at point
B, $w$ and $Z(\cdot)$ must appropriately shift up.

The intuition behind this set of results is
easy to explain. When the steady-state relative
wage $w^*$ equals one and equation (17)
holds with strict inequality, at each point in
time $t$, some Northern workers produce
competitive products that are also produced
in the South. An increase in the Southern
labor force $L^S$ increases Southern income
$I^S$ and Southern expenditure and thus
increases world expenditure $E^W$. As a result
of the increase in world expenditure,
dominant firms want to produce more
[$q^d = E^W/(n\alpha)$] and to hire more production
workers. Since dominant-firm profits
[$\pi^d = (\alpha - w)E^W/(\alpha n)$] increase, perfect competition
in each R&D race induces firms to
devote more labor $L^R$ to R&D. With a fixed
endowment of labor $L^N$ in the North, the
increased employment of production workers
by dominant firms and the increased
employment of R&D labor are exactly balanced
by a decreased employment of production
workers by competitive firms in the
North. However, when the steady-state relative
wage $w^*$ exceeds one, this reallocation
of labor within the North in response to an
increase in $L^S$ is not possible, because there
are no workers in the North producing competitive
goods. Without any change in $w^*$,
an increase in $L^S$ leads to excess demand
for labor by firms in the North. It is still
true that an increase in the Southern labor
force $L^S$ increases world expenditure $E^W$,
and as a result, dominant firms want to hire
more production workers [$q^d = E^W/(n\alpha)$].
Thus, the relative wage $w^*$ must rise enough
[and dominant-firm profits $\pi^d = (\alpha - w)E^W/(n\alpha)$
fall enough] so that firms in the
North hire fewer R&D workers; when firms
hire fewer R&D workers, this leads to a
new steady-state equilibrium with fewer
dominant firms.


IV. The Effects of Patents and Tariffs 


First, consider the steady-state effect of a
once-and-for-all increase in the patent
length T (or a decrease in the rate of technology
transfer to the South). Increasing T
causes the zero profit condition $Z(L^R, 1)$ to
shift down and causes the R&D supply
function $R(L^R)$ to shift up but leads to no
change in the labor market constraint
$F(L^R)$. If the steady-state relative wage w*
equals one [and eq. (17) holds with strict
inequality], then an increase in T increases
both the steady-state labor force engaged in
R&D and the steady-state number of dominant
firms in the North. This case is illustrated
in Figure 5. Increasing T increases
the reward for innovative activity. Firms respond
to this incentive by increasing the
resources they devote to R&D. However, if
the steady-state relative wage w* exceeds
one, then an increase in T decreases the
steady-state labor force engaged in R&D,
increases the steady-state number of dominant
firms in the North, and increases the
relative wage of Northern workers. This is
illustrated in Figure 6. This counterintuitive
result is explained as follows: increasing T
increases the demand for production workers
by dominant firms in the North, because
dominant firms have longer lives. Since $L^N$
is fixed, the increased employment of production
workers in the North must be exactly
balanced by a decreased employment
of R&D workers. The relative wage w*
must rise enough and dominant-firm profits
$\pi^d$ fall enough so that profit-maximizing
firms in the North appropriately reduce their
R&D expenditures.

Because each product experiences a Vernon-type
product life cycle in the steady-state
equilibrium, the international trade
pattern repeatedly changes over time. Industries
in the North die in the sense that
production of particular products ceases.
Other industries in the North are born when
new products are discovered. Thus, in this
steady-state equilibrium, production workers
in the North repeatedly lose their jobs
and must find employment in other sectors
of the economy. Given this scenario, tariffs
designed to save the jobs of production
workers in dying industries would have considerable
political support.

We now relax the previous assumption
that free trade prevails between the North
and the South throughout time and explore
the comparative steady-state effects of tariffs
designed to protect dying industries in
the North from Southern competition. We
will assume that the labor force $L^N$ in the

FIGURE 5. EFFECTS OF INCREASE IN PATENT
LENGTH WHEN w = 1

FIGURE 6. EFFECTS OF INCREASE IN PATENT
LENGTH WHEN w > 1


North is sufficiently small so that the
steady-state relative wage of Northern production
workers w* exceeds one. This assumption
is supported by empirical evidence.¹⁰
At each time t, the government in
the North imposes per-unit tariffs on the
importation of the $\bar{m}$ products whose
patents have most recently expired. The
government chooses $\bar{m}$ as its policy instrument;
that is, it chooses how many industries
in the North to protect from Southern
competition. For these tariffs to have any


¹⁰Keith Maskus (1989) and Christopher Clague
(1988) present evidence showing that the wage of unskilled
workers in some LDC’s is about one-tenth that
of unskilled Northern workers.

effects on employment in the $\bar{m}$ Northern
industries, each per-unit tariff must be at
least w* − 1. In other words, each tariff
must be prohibitive. We will assume that
this holds for each protected industry. As a
result, there is no international trade in any
of these $\bar{m}$ protected products, and no government
revenue is generated by the tariffs.
In the steady-state equilibrium with $\bar{m}$
products protected, m products are produced
exclusively in the North by dominant
firms, $\bar{m}$ products are produced both in the
North and in the South, and $n - m - \bar{m}$
products are produced exclusively in the
South by competitive firms. Since Southern
income $I^S$ still equals $L^S$ in equilibrium, the
South must produce $L^S/n$ units of each
of the $\bar{m}$ protected products for domestic
consumption. Since Southern production
must satisfy world demand for each of the
$n - m - \bar{m}$ products, by symmetry,


$$\frac{E^W}{n} = \frac{L^S - \dfrac{\bar{m}}{n}L^S}{n - m - \bar{m}} \tag{18}$$

must be satisfied. The right-hand side of
equation (18) represents how much Southern
labor must be used to produce each of
the $n - m - \bar{m}$ products that are produced
exclusively in the South.

Steady-state equations (15) and (16) remain
unchanged by the introduction of tariffs.
Given that w > 1, equation (17) becomes

$$L^R + \frac{mE^W}{n\alpha} + \frac{\bar{m}E^N}{nw} = L^N. \tag{19}$$

Using the identity $E^W = E^N + I^S$ and substituting
equation (18) into (19) we get

$$L^R + \left[\frac{m}{\alpha} + \frac{\bar{m}}{w}\right]\frac{L^S - \dfrac{\bar{m}}{n}L^S}{n - m - \bar{m}} - \frac{\bar{m}L^S}{nw} = L^N. \tag{20}$$


This equation implicitly defines the new
$m = F(L^R, \bar{m}, w)$ function. It is easily verified
that $\partial F/\partial L^R < 0$, $\partial F/\partial \bar{m} < 0$, and
$\partial F/\partial w > 0$. Furthermore, when $\bar{m} = 0$, the
new F(·) function coincides with the old
F(·) function.

FIGURE 7. EFFECTS OF INCREASE IN THE NUMBER
OF PROTECTED INDUSTRIES WHEN w > 1
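
The sign properties claimed for the new F(·) function can be checked numerically. The following is a minimal Python sketch, assuming equation (20) reads $L^R + [m/\alpha + \bar{m}/w]\,L^S(1 - \bar{m}/n)/(n - m - \bar{m}) - \bar{m}L^S/(nw) = L^N$. The parameter values for n, α, $L^S$, and $L^N$ and the function names are illustrative assumptions, not values from the paper.

```python
def residual_20(m, L_R, m_bar, w, n=100.0, alpha=1.5, L_S=50.0, L_N=30.0):
    """Left side minus right side of equation (20); zero when (20) holds."""
    # Southern labor per exclusively Southern product, the right side of (18)
    per_product = L_S * (1.0 - m_bar / n) / (n - m - m_bar)
    return L_R + (m / alpha + m_bar / w) * per_product - m_bar * L_S / (n * w) - L_N


def F(L_R, m_bar, w, n=100.0, alpha=1.5, L_S=50.0, L_N=30.0):
    """Solve (20) for m; the equation is linear in m after cross-multiplying."""
    S = L_S * (1.0 - m_bar / n)
    K = L_N - L_R + m_bar * L_S / (n * w)
    return (K * (n - m_bar) - m_bar * S / w) / (S / alpha + K)


# Hypothetical baseline: L_R = 5, m_bar = 5, w = 1.2
base = F(5.0, 5.0, 1.2)
print(F(6.0, 5.0, 1.2) < base)  # more R&D labor: F falls
print(F(5.0, 6.0, 1.2) < base)  # more protected industries: F falls
print(F(5.0, 5.0, 1.3) > base)  # higher relative wage: F rises
```

Under these assumed values all three comparisons agree with the stated signs of the partial derivatives. This is a spot check at one parameter point, not a proof.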

Suppose that in the initial steady-state
equilibrium $\bar{m} = 0$ and w = w* > 1. This is
illustrated by the intersection of the
$m = Z(L^R, w)$, $m = R(L^R)$, and $m = F(L^R, \bar{m}, w)$
functions at point A in Figure 7. An increase
in $\bar{m}$ shifts down the labor market
constraint F(·). Now there is an excess demand
for Northern labor at point A. The
steady-state relative wage w must rise above
w*, shifting up both the zero profit condition
Z(·) and F(·) until a new intersection
is established (at point B, with
w = w′ > w*).

Thus, we can conclude that when Northern
production workers earn higher wages
than their Southern counterparts, an increase
in the number of protected industries
in the North ($\bar{m}$) decreases the steady-state
rate of product innovation in the
North, decreases the steady-state number of
dominant firms in the North, and increases
the steady-state relative wage of Northern
production workers.

The intuition behind this result is easy to 
explain. An increase in the number of pro- 
tected industries in the North raises the 
demand for Northern production workers. 
With the Northern labor endowment fixed,
the labor devoted to R&D must decline.
This necessitates an increase in Northern 
relative wages, which raises production costs 
for dominant firms and lowers dominant- 
firm profit flows. As a result, R&D becomes 
less profitable, and the steady-state number 
of dominant firms (winners of R&D races) 
declines. Notice that in the case of w=1, 
increasing the number of protected indus- 
tries in the North does not have any effect 
on the steady-state number of dominant 
firms and the rate of product innovation. 

The introduction of a nontraded sector in 
the North does not affect the impact of 
protectionism on the rate of product inno- 
vation. The effects of patents on the rate of 
innovation are robust to the introduction of 
a nontraded good but may not be robust to 
the introduction of many nontraded goods. 
If a sufficiently large fraction of the economy
involves nontraded goods, then an increase
in T could lead to an increase in $L^R$
when w > 1.¹¹


V. Conclusion 


We have analyzed a dynamic model of 
innovation, technology transfer, and inter- 
national trade. Although highly stylized and 
in some respects unrealistic, this model nev- 
ertheless captures some of the forces that 
are shaping the pattern of trade in the real 
world today—forces that are not easily cap- 
tured in the traditional Heckscher-Ohlin 
trade model. 

In our model, sustained product innova- 
tion in the North enables Northern workers 
to earn higher wages than comparable 
workers in the South. We have carefully 
analyzed how the rate of product innovation 
and the relative wage of Northern work- 
ers are affected not only by changes in the 
rate of technology transfer and the world 
labor endowment, but also by protectionist 
government policies. What results is an ex- 
planation for the small number of new in- 


¹¹An appendix containing an algebraic analysis of
the effects of patents and tariffs in the presence of
nontraded goods is available from the first author upon
request.

dustries in the North that have arisen to
replace older, dying industries as employers 
of Northern labor. By artificially inflating 
the wages of Northern workers, protection- 
ist government policies induce sluggish in- 
novative performance in the North. 

By focusing on the steady-state equilib- 
rium, we have necessarily abstracted from 
examining the welfare implications of com- 
parative steady-state exercises. This impor- 
tant task constitutes a nontrivial extension 
of our model, and it represents a topic for 
future research. 


APPENDIX A 


Existence and Uniqueness of 
Steady-State Equilibrium 


Proving that a steady-state equilibrium
exists reduces to showing that either (i)
$m = Z(L^R, w)$, $m = R(L^R)$, and $m = F(L^R)$
simultaneously intersect at some point
$(\tilde{L}^R, \tilde{m}, \tilde{w}) \in \mathbb{R}^3_+$ for some $\tilde{w} > 1$ or (ii)
$m = Z(L^R, 1)$ and $m = R(L^R)$ intersect at some
point $(\tilde{L}^R, \tilde{m}) \in \mathbb{R}^2_+$ where $\tilde{m} \le F(\tilde{L}^R)$. The
graph of R(·) is upward sloping in $L^R$, and
the graph of F(·) is downward sloping in
$L^R$. Moreover, $R(0) < F(0)$ and $R(L^N) > F(L^N) = 0$
[given the properties of the h(·)
function]. Consequently the functions R(·)
and F(·) must have a unique intersection,
which we will denote $(L^{R*}, m^*)$ (see
Fig. 1).

Suppose that the intersection of R(·) and
F(·) lies above the Z(·) graph evaluated at
w = 1 [$Z(L^{R*}, 1) < m^*$]. Then, increasing the
wage shifts the Z(·) graph upward, and
since $\lim_{w \to \infty} Z(L^{R*}, w) = +\infty$, there exists
a wage w* > 1 such that all three graphs
intersect simultaneously at $(L^{R*}, m^*, w^*)$.
This intersection is unique, corresponds to
the case-(i) steady-state equilibrium, and is
illustrated by point A in Figure 1.

Suppose that the intersection of R(·) and
F(·) does not lie above the Z(·) graph
evaluated at w = 1 [$Z(L^{R*}, 1) \ge R(L^{R*}) = m^*$].
Notice that $R(0) - Z(0, 1) > 0$ and that
$R(L^{R*}) - Z(L^{R*}, 1) \le 0$ [$\lim_{L^R \to 0} Z(L^R, 1) = -\infty < 0 < R(0) = T/h(0)$].
This guarantees
that the Z(·) function evaluated at w = 1
and the R(·) function must intersect for
some $\tilde{L}^R$ and $\tilde{m}$ which are less than or
equal to $L^{R*}$ and $m^*$, respectively. Consequently
w = 1, $\tilde{L}^R$, and $\tilde{m}$ are steady-state
equilibrium values satisfying case (ii). This
steady-state equilibrium is illustrated by
point B in Figure 2.

The steady-state equilibrium is unique
provided that the functions $Z(L^R, 1)$ and
$R(L^R)$ have at most one intersection in the
interval $(0, L^N]$. It suffices to show that


=[(L—L)(.—e*?)Lh'(L) pe 
+L i e~PT)(ePM(L) 1] 
x| L224) —1)?] a 


OR(L) —Th(L) 
ôL [A(L)P 





for all $L \in (0, L^N)$. From equation (5), it
follows that
(aa) 0> e OpLh' (L) > 1— ePhtt), 


Substituting (A2) into (A1), it suffices to
show that


1— eT = Th'(L) 
PENE EANA EEE > p 
L(e*®-1) [aL]? 


for all $L \in (0, L^N)$. Since $h(L) \ge 0$ and
$h'(L) < 0$ for all $L \in (0, L^N)$,


(A3) 


—Th'(L)  —Thk'(0) 
(A4) i MM 
[a [aE] 
Also, by equation (5) 
1—e AF 1— ener 


(AS) L( ee) —1 j 


LN( err) _ 1) 

for all $L \in (0, L^N)$. Combining equations
(A3), (A4), and (A5) yields equation (6).
Thus, the steady-state equilibrium is unique.


APPENDIX B


Steady-State Consumer Behavior


With time-separable utility, the represen- 
tative Northern consumer’s maximization 
problem can be solved in two stages.¹² First,
for given total expenditure at time t, E(t), 
and prices of available products, we find the 
allocation of expenditure that maximizes the 
consumer’s CDP utility function. Then we 
solve for the time path of expenditures that 
maximizes U. 

The first stage of the consumer optimization
problem was analyzed in detail in Section II.
The second stage involves choosing
the optimum expenditure path E(t), $t \in (0, \infty)$.
The representative Northern consumer’s
assets A(t) evolve according to the
equation¹³

$$\dot{A}(t) = r(t)A(t) + I(t) - E(t) \tag{B1}$$

where r(t) is the instantaneous interest rate
and I(t) is the consumer’s income at time t.
Dots denote time derivatives. Furthermore,
assets and income must satisfy the feasibility
condition


$$\liminf_{t \to \infty}\left[A(t) + \int_t^{\infty} \exp\!\left(-\int_t^{\tau} r(s)\,ds\right) I(\tau)\,d\tau\right] \ge 0. \tag{B2}$$


This feasibility condition states that the sum
of assets and the discounted value of income
is nonnegative in the limit as t approaches
infinity.

¹²See Hal Varian (1984 p. 148).
¹³See Kenneth Arrow and Mordecai Kurz (1970
Ch. 7).

Infinitely lived consumers maximize total
lifetime utility [eq. (1)] subject to equations
(B1) and (B2). After some algebraic manipulation,
equation (8), which is the lifetime
utility function, reduces to


(B3) U=n fe” log E(t) at 


to 


e™ Pto 1 
+ log ——— 
| p | 08 ng 





+T(J,to). 


Notice that the second, third, and fourth
terms are all constants from the point of
view of the consumer [who is choosing E(t)].
The third term represents the discounted
value of all future innovations in the steady
state. If α = 1, then new products are identical
to old products, and this term disappears.

We conjecture that, in the steady state,
the market interest rate r(t) is constant over
time and equal to the consumer’s discount
parameter ρ. We will subsequently verify
that, given r = ρ, the optimum path of consumer
expenditures and assets will also be
constant over time in the steady state. Furthermore,
all markets will clear at each instant
in time.¹⁴

Consequently, the consumer solves the
following optimal control problem in the
steady state at time $t_0$:


$$\max_{E(t) \ge 0;\ t \in [t_0, \infty)} \; n\int_{t_0}^{\infty} e^{-\rho t}\log E(t)\,dt + \text{constant} \tag{B4}$$

subject to $A(t_0) = A_0$,

$$\dot{A}(t) = \rho A(t) + I - E(t) \tag{B5}$$


¹⁴Alternatively, one could use (6) and (7), allowing
the instantaneous interest rate r(t) to vary over time.
In this case, if one assumes that expenditure does not
change at the steady state, then r(t) = ρ; that is, the
interest rate is constant in the steady state. For more
details, see Grossman and Helpman (1989), in particular
their equation 5.

and

$$\liminf_{t \to \infty}\left[A(t) + \frac{I}{\rho}\right] \ge 0. \tag{B6}$$

The constraints (B5) and (B6) follow from
(B1) and (B2), assuming that r(t) = ρ for all
$t \ge t_0$.

The current-value Hamiltonian for this
optimal-control problem is

$$H = \log E(t) + \lambda(t)\{\rho A(t) + I - E(t)\} \tag{B7}$$

and the necessary conditions are $1/E(t) = \lambda(t)$,
$\dot{\lambda}(t)/\lambda(t) = 0$, and equations (B5) and
(B6). Thus,

$$E(t) = E_0. \tag{B8}$$

Solving (B5), we get

$$A(t) = e^{\rho(t - t_0)}\left[A_0 + \frac{I - E_0}{\rho}\right] + \frac{E_0 - I}{\rho}. \tag{B9}$$


If $A_0 + (I - E_0)/\rho < 0$, then the dominant
term in (B9) approaches −∞ as t approaches
+∞, and the feasibility condition
(B6) is not satisfied. If $A_0 + (I - E_0)/\rho > 0$,
then the dominant term approaches +∞,
and (B6) is clearly satisfied. However, (B6)
would still be satisfied if $E_0$ were increased
slightly. Hence higher consumption at every
instant in time would be feasible, and therefore
the expenditure path $E(t) = E_0$ would
not be optimal. Thus, the optimal expenditure
path must satisfy

$$A_0 + \frac{I - E_0}{\rho} = 0 \tag{B10}$$

and therefore

$$A(t) = A_0 = \frac{E_0 - I}{\rho} \tag{B11}$$

which obviously satisfies (B6).

To summarize, we have shown that, if the
steady-state interest rate r(t) = ρ, then the
consumer optimizes by choosing a constant
expenditure path over time $E_0$ where

$$E_0 = \rho A_0 + I. \tag{B12}$$

That is, the consumer spends his wage income
and the interest earnings on his assets at
each instant in time.
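
The argument leading to (B12) can be illustrated numerically. The sketch below is a minimal Python example that evaluates the closed-form asset path obtained by solving (B5) with constant expenditure, taking $t_0 = 0$. The values of ρ, $A_0$, and I and the function name are illustrative assumptions, not values from the paper.

```python
import math


def asset_path(A0, I, E0, rho, t):
    """Assets at time t when dA/dt = rho*A + I - E0 and A(0) = A0 (t0 = 0 assumed)."""
    return math.exp(rho * t) * (A0 + (I - E0) / rho) + (E0 - I) / rho


# Hypothetical parameter values chosen only for illustration
rho, A0, I = 0.05, 10.0, 2.0
E_opt = rho * A0 + I  # candidate optimum from (B12)

for label, E0 in [("E0 below optimum", E_opt - 0.1),
                  ("E0 at optimum", E_opt),
                  ("E0 above optimum", E_opt + 0.1)]:
    A_late = asset_path(A0, I, E0, rho, 200.0)
    # Feasibility (B6) looks at A(t) + I/rho for large t
    print(label, "feasible:", A_late + I / rho >= 0, "assets constant:", abs(A_late - A0) < 1e-6)
```

With expenditure above $\rho A_0 + I$ assets diverge to −∞ and (B6) fails. With expenditure below it assets grow without bound, so some consumption is left unused. Only the path in (B12) keeps assets constant.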
Competition by Choice:
The Effect of Consumer Search on Firm Location Decisions


By Marc Dudey*


This paper relates firm location choice and consumer search. Firms that cluster 
together attract consumers by facilitating price comparison, but clustering increases
the intensity of local competition. I construct a simple model which shows
that firms may choose head-on competition by locating together. Under reason- 
able conditions, this is the only equilibrium outcome. (JEL 026) 


If consumers find it more convenient to 
compare the offerings of firms that are lo- 
cated close together, then firm locations can 
influence consumer search patterns. As em- 
phasized by Tibor Scitovsky (1950) and in 
the marketing literature (see, in particular, 
Paul H. Nystrom [1930] and Richard L. Nel- 
son [1958]), firms should take this into ac- 
count when choosing location. This paper 
analyzes firm location choice under the as- 
sumption that consumers are limited in their 
ability to compare prices across locations. 

My analysis applies to markets in which 
consumers visit firms to learn prices; adver- 
tising and telephone search are assumed to 
play a negligible role in transmitting price 
information. One can picture a shopper who 
is looking for a pair of shoes to match the 
color of a particular dress. It could well be 
easier for her to visit a store than to de- 
scribe what she wants over the phone, while

partment of Economics, Rice University, Houston, TX
77251). This paper reflects my own views and should
not be interpreted as reflecting the views of the Board
Dudey (1986). I thank Larry Benveniste, Sally Davies,
and other colleagues at the Federal Reserve Board,
grateful to an anonymous referee for numerous comments
and suggestions. My thanks also go to seminar
participants at the 1988 Winter Econometric Society
Meetings, the University of Iowa, and the Federal

the cost to a shoe store of supplying sufficiently
detailed advertising about its prod-
uct line may be prohibitive. One might 
therefore expect shoe store locations to af- 
fect the shopper’s search pattern. She could, 
for example, lower her search costs by ob- 
taining price information from all the shoe 
stores in one shopping center before visiting 
the shoe stores at another shopping center. 
This view of the search process is obviously 
different from the random search process 
envisioned by George J. Stigler (1961). As 
noted by Motty Perry and Avi Wigderson 
(1986), the random-search approach has 
considerable appeal if consumers engage in 
telephone search, but it seems inappropri- 
ate if consumers learn prices by visiting 
firms. 

My analysis focuses on how firms choose 
location when they know how consumer 
search patterns will be affected by their 
location decisions. Consumers may be at- 
tracted to locations occupied by a relatively 
large number of firms because they expect a 
relatively high degree of competition there. 
As a result, firms may have an incentive to 
cluster together. On the other hand, to the 
extent that clustering promotes competition, 
firms have another opposing incentive to 
locate apart. In Nystrom’s (1930) words, 


Stores that sell exactly the same kinds 
of goods and that are clearly competi- 
tive do not necessarily merely divide 
the business that might have been done 
if there were but one store. Known 
competition in itself attracts trade, and 
people come from farther away... 

[pp. 137-8] 


It may safely be presumed, however,
that there is a limit to the good that 
can come to the individual store from 
the clustering of competitive shops. 
While the group secures a greater to- 
tal volume than could be secured by 
widely scattered individual stores, it is 
quite another question as to whether 
the individual stores may not suffer 


distinct losses from competition... 
[pp. 146-7] 


I develop a location-search game to study 
the tension between these incentives. The 
game is played by finitely many consumers 
and quantity-setting firms (to begin with, 
some form of limited access to the produc- 
tion or retailing technology is assumed to 
restrict entry into the industry). For simplic- 
ity, firms produce at the same constant 
marginal cost, and consumers have identical 
demand functions. 

In the game, each firm chooses a location 
before consumers decide where to shop. It 
is assumed that locations, like shopping cen- 
ters, can accommodate more than one firm; 
in the rest of the paper, I will use the term 
“shopping center” to describe a location 
that is occupied by at least one firm. A 
“shopping plan” for a consumer specifies a 
shopping center that the consumer will visit 
(if any) for any distribution of firms across
shopping centers. The interpretation is that
consumers know where firms are located
and have enough time to visit only one
shopping center.¹ However, a consumer de-
ciding where to shop cannot directly ob-
serve the terms of trade available at any
shopping center. This is modeled by assum-
ing that firms choose quantity after con-
sumers have decided where to shop.² At
each shopping center, consumers buy at a 


¹Throughout most of the paper, I assume that consumers
are endowed with complete information about
where firms are located. This assumption has several 
possible interpretations. Consumers could learn firm 
locations in the course of their daily travels or from 
past shopping experiences. Alternatively, firms might 
advertise their locations. A model in which firms and 
consumers choose location simultaneously is presented 
in Section V, Part B. 

²An alternative assumption (that firms choose
quantity and consumers make shopping plans simultaneously)
is discussed in Section V, Part A.

price that equates local market demand and
supply.

Subgame perfect equilibrium (see Rein- 
hard Selten, 1975) is used to solve the game. 
This requires that firms play A. Augustin 
Cournot’s (1838) quantity-setting game in 
postlocation competition.³ It also requires
that each consumer’s shopping plan maxi- 
mize the consumer’s utility for any distribu- 
tion of firms across shopping centers, given 
the manner in which firms choose quantities 
and given the shopping plans of other con- 
sumers. Finally, it requires that each firm’s 
choice of location maximizes the firm’s 
profit, given the location choices of other 
firms, the consumers’ shopping plans, and 
the manner in which firms select quantities. 

The central finding is that, under some 
reasonable conditions, there is a unique, 
subgame perfect equilibrium outcome in 
which all firms locate in the same shopping 
center (although there are parameter values 
for which firms do not all locate together). 
Thus, the model can be used to explain why 
firms selling very similar or even homoge- 
neous products—for example, gas stations, 
fast-food restaurants, car dealers, and farm- 
ers selling fresh produce—often cluster to- 
gether. This explanation of why firms may 
locate together differs from the celebrated 
story told by Harold Hotelling (1929) in that 
it focuses on consumer search rather than 
transportation costs as the force driving the 
clustering phenomenon.⁴ It suggests that
clustering will be more pronounced in mar- 
kets where consumers learn prices by visit- 
ing firms and not via advertising or tele- 
phone search. 


³Quantity-setting Cournot behavior is assumed for
simplicity and concreteness. All propositions in the 
paper could be reformulated using a more general 
abstract specification of the form of postlocation com- 
petition. 

⁴Claude D’Aspremont et al. (1979) explain why the
Hotelling model is an unsatisfactory explanation of 
clustering if firms choose prices. When firms locate 
together in the middle of Hotelling’s linear market, 
price competition forces profits to zero. Consequently, 
each firm has an incentive to locate apart from its rival 
and generate local monopoly power. This is also the 
incentive firms have to locate apart in my model. It is 
the incentive firms have to locate together that makes 
the two approaches different.

My findings are related to recent work by
Konrad Stahl (1981, 1982, 1987) and Asher 
Wolinsky (1983). Stahl and Wolinsky study 
models of firm location choice in which firms 
sell differentiated products and consumers 
search for a favorite brand. In these papers, 
it is argued that firms may have an incentive 
to cluster if consumers are attracted to loca- 
tions where a large variety of products is 
available. However, both authors assume 
that consumers do not expect lower prices 
at locations occupied by larger numbers of 
firms. Consequently, it is not the competi- 
tion between clustered firms that attracts 
consumers in their models. 

The paper is organized as follows. Section 
I contains a more formal description of the 
model outlined above. Section II presents
the conditions that ensure a unique sub- 
game-perfect-equilibrium outcome of the 
model in which all firms locate in the same 
shopping center. It also presents examples 
which show that, for some parameter val- 
ues, firms will distribute themselves across 
more than one shopping center. Section III 
demonstrates that versions of the existence 
and uniqueness results hold if entry costs 
(instead of limited access to the production 
or retailing technology) determine the num- 
ber of firms in the industry. 

Section IV shows that versions of the 
existence and uniqueness results can be ob- 
tained if firms locate sequentially instead of 
simultaneously. Two alternative sequencing 
assumptions are discussed in Section V. The 
first is that firms choose quantities and con- 
sumers make shopping plans simultane- 
ously, and the second is that firms choose 
location and consumers decide where to 
shop simultaneously. Section VI contains 
some concluding remarks. 


I. The Basic Model 


The analysis is based on a four-stage game 
that is played by a collection of firms and 
consumers. The notation used for the basic 
model is presented in Table 1. Limited ac- 
cess to the production or retailing technol- 
ogy is assumed to fix the number of firms at 
n. In the first stage of the game, firms simultaneously
choose location. The firms may

TABLE 1. NOTATION FOR THE BASIC MODEL

Variable       Definition

m              number of consumers
n              number of firms
f(p)           individual consumer demand at price p
c              constant marginal cost of production
$q^C(x, y)$    Cournot equilibrium quantity at a location
               with x consumers and y firms
$\pi(x, y)$    Cournot equilibrium profit at a location
               with x consumers and y firms


share locations, and there are at least n 
locations, so any configuration of firms 
within the set of locations is permitted. In 
the second stage, m consumers with the 
same demand function learn where firms 
are located and decide if and where to 
shop. The demand function f, which maps 
nonnegative prices into nonnegative quanti- 
ties, is assumed to satisfy two conditions: (i) 
it is monotonically decreasing as long as it 
takes positive values and (ii) the area under- 
neath it is bounded. In the third stage, firms 
choose quantities. These quantities are ob- 
tained by the firms at a positive constant 
marginal cost of c. In the fourth stage, each 
shopper learns the terms of trade that are 
available at the shopping center she decided 
to visit and makes her purchases. 

Terms of trade and consumer payoffs are 
determined as follows. If, at a shopping 
center with x consumers, the firms collec- 
tively choose a total quantity that does not 
exceed xf(0), consumers make their pur- 
chases at a price that clears the market. If 
the firms collectively choose a total quantity 
that exceeds xf(0), price equals zero. A 
consumer who visits a shopping center re- 
ceives a payoff equal to the surplus she 
receives from buying at the price at which 
trade occurs less her round-trip transporta- 
tion costs (each location can serve as the 
residence of one or more consumers). Con- 
sumers who do not go shopping receive a 
payoff of zero. 

The solution concept that will be used 
here (subgame perfect equilibrium) requires 
that the profit-maximizing firms behave as 
Cournot competitors at any location attracting
at least one consumer. I will assume that
a unique, symmetric Cournot equilibrium
exists for any combination of firms and consumers.⁵
Let $q^C(x, y)$ and $\pi(x, y)$ denote
the Cournot equilibrium quantity and profit
per firm at a shopping center with x ≥ 0
consumers and y ≥ 1 firms. These functions
are assumed to satisfy three standard requirements.
Namely, $\pi(x, y)$ is positive for
all positive x and y, $\pi(x, y)$ is monotonically
decreasing in y for any positive x, and
$y q^C(x, y)$, the aggregate quantity supplied in
Cournot equilibrium, is monotonically increasing
in y for any positive x.⁶


II. Main Results 


Since transportation costs complicate 
matters and are not needed to drive the 
clustering phenomenon, it will be both in- 
structive and convenient to examine a sim- 
plified version of the model in which con- 
sumers incur no transportation costs. The 
case of positive transportation costs is stud- 
ied in a related working paper (Dudey, 
1989a). In that paper, a metric is defined on 
the set of locations to measure distances, 
and consumers incur a constant cost k of 
traveling each unit of distance. Results that 
are very similar to Propositions 1-5 below 
are shown to hold in the case of positive 
transportation costs if the product of k and 
the maximum distance between consumers 
is not too large. 

The first aim of this section is to prove 
the central existence result. 


PROPOSITION 1: There is an equilibrium 
in which all firms locate in the same shopping 
center.⁷


⁵For the case of one firm, the unique, symmetric
Cournot equilibrium is the unique solution to the 
quantity-setting monopolist’s profit-maximization prob- 
lem. Sufficient conditions for the existence of a unique, 
symmetric Cournot equilibrium for any number of firms 
can be found in F. Szidarovsky and S. Yakowitz (1977), 
for example. 

⁶These assumptions can be derived using the more
primitive (but stronger) hypothesis that f is twice differentiable,
f′ < 0, and f(c) > 0.

⁷Unless otherwise specified, “equilibrium” will refer
to subgame perfect equilibrium.

The proof will make use of the following
preliminary. 


LEMMA 1: The Cournot equilibrium price at
any shopping center that attracts at least one
consumer does not depend on the number of
consumers. In addition, the Cournot equilibrium
quantity and profit per firm are linearly
homogeneous in the number of consumers.


PROOF OF LEMMA 1: 

Suppose y firms producing at constant 
marginal cost c face x =>1 consumers. The 
unique, symmetric Cournot equilibrium 
quantity g°(x,y) is characterized by the 
inequality 

C 
„11 99 (x;y) 
(1) 7 | heces) -eaclar a) 
y = q X,Y 
rifts! I Mae 


for all q in [0,%). By (1) and the uniqueness 
of the symmetric Cournot equilibrium quan- 


tity, 
(2) xq°(1,y) =4°(x,y). 


It follows from (2) that the Cournot equilibrium
price,

(3)  $f^{-1}\!\left[\frac{y\,q^c(x,y)}{x}\right] = f^{-1}\!\left[y\,q^c(1,y)\right]$

is independent of x and

(4)  $\pi(x,y) = x\,\pi(1,y).$
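
As a numerical illustration of Lemma 1, the following Python sketch assumes a linear per-consumer inverse demand $f^{-1}(q) = a - bq$. The parameter values (a = 10, b = 1, c = 2) and all function names are illustrative assumptions, not taken from the text. The sketch uses the closed-form symmetric Cournot quantity for this demand, spot-checks inequality (1) on a grid of deviations, and then compares both sides of (2), (3), and (4).

```python
# Illustrative parameters (assumptions): per-consumer inverse demand
# a - b*q and constant marginal cost c.
A, B, C = 10.0, 1.0, 2.0


def inverse_demand(q_per_consumer):
    """Per-consumer inverse demand f^{-1}(q) = a - b*q, floored at zero."""
    return max(A - B * q_per_consumer, 0.0)


def cournot_quantity(x, y):
    """Symmetric Cournot quantity per firm with x consumers and y firms."""
    return x * (A - C) / (B * (y + 1))


def price(x, y):
    """Equilibrium price: f^{-1} evaluated at total output per consumer."""
    return inverse_demand(y * cournot_quantity(x, y) / x)


def profit(x, y):
    """Equilibrium profit per firm, pi(x, y)."""
    return (price(x, y) - C) * cournot_quantity(x, y)


def satisfies_inequality_1(x, y, grid=2000):
    """Check on a grid that no unilateral deviation q beats q^c(x, y)."""
    qc = cournot_quantity(x, y)
    equilibrium_profit = profit(x, y)
    q_max = 2.0 * x * A / B  # beyond this output the price is zero
    for i in range(grid + 1):
        q = q_max * i / grid
        p = inverse_demand(((y - 1) * qc + q) / x)
        if (p - C) * q > equilibrium_profit + 1e-9:
            return False
    return True


x, y = 5, 3
print(satisfies_inequality_1(x, y))                        # inequality (1)
print(x * cournot_quantity(1, y), cournot_quantity(x, y))  # both sides of (2)
print(price(1, y), price(x, y))                            # both sides of (3)
print(x * profit(1, y), profit(x, y))                      # both sides of (4)
```

Under these assumed parameters the price is the same whether the three firms face one consumer or five, while quantity and profit per firm scale in proportion to the number of consumers.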


The practical implication of (3) is that an 
individual consumer does not need to take 
into account the search patterns of other 
consumers when she is deciding where to 
shop. She only needs to consider where 
firms are located. This result is used in the 
following argument. 


PROOF OF PROPOSITION 1: 
The proof is by construction. Let each 
firm’s strategy specify the same location s.
If there are at least three firms, let con-
sumer strategies specify one of the shopping
centers occupied by the largest number of 
firms. If there are two firms, each consumer 
visits a shopping center; in case one of the 
two firms is at s and the other is not, all 
consumers go to s. A single firm locating 
outside s would therefore get no con- 
sumers, given that rival firms are located 
at s. 

Now, the assumptions that demand is 
monotonically decreasing as long as it takes 
positive values, $\pi(x, y)$ is positive if x and y
are positive, and $y\,q^c(x, y)$ is monotonically
increasing in y for any positive x imply that 
the equilibrium price at a shopping center 
attracting at least one consumer will be 
monotonically decreasing in y, given the 
number of consumers attracted. This obser- 
vation together with (3) and the zero-trans- 
portation-cost assumption implies that a 
consumer cannot improve her payoff by de- 
viating from the strategy given above. 


The main idea behind Proposition 1 is that, 
if all firms are located together, it will not 
pay any single firm to move to another 
location, because consumers will correctly 
predict that such a firm would charge the 
monopoly price. Proposition 1 explains why 
firms selling similar or identical products 
may locate together even though the result 
will be an increase in the intensity of local 
competition.⁹

Notice that, since the location s in the 
proof of Proposition 1 is arbitrary, there are 
as many clustering equilibria as locations. 


⁸Obviously, such a statement cannot be made if
consumers live apart from each other and incur posi- 
tive transportation costs. Thus, the presence of at least 
three firms is required to extend Proposition 1 to the 
case of positive transportation costs. 
⁹Warren Skoning, formerly in charge of real estate
for Sears’ Midwestern division, noted that

As a general rule, we [Sears] wouldn’t want to locate
with two other department stores with the same mer-
chandise lines and pricing, but, on the other hand, it
may be best to have a third store in your center than to
float away to form the nucleus of another, competing
center. [quoted in Milton Brown et al., 1970 p. 191]


This statement is easily understood in the context of 
the basic model. 
One might therefore imagine that some in-
trinsic characteristic of s makes it a focal
point (see Thomas Schelling, 1960 pp. 54-8) 
or that some agent, a “market organizer,” 
proposes s before the firms locate. 

The next proposition shows that, if com- 
petition between y+1 firms is not much 
more intense than competition between y 
firms for certain values of y, then clustering 
is the only equilibrium outcome of the basic 
model. Formally, the condition that ensures 
uniqueness of the clustering outcome may 
be written as 


(5)  $\pi(1,y) < (n/y)\,\pi(1,y+1)$


for any y not equal to n that divides n. 
[Examples of demand functions that result
in condition (5) being satisfied or not satisfied
are presented immediately after the
proof of the next proposition.]

The value of condition (5) in proving that
clustering can be the only equilibrium out- 
come is not hard to understand. Loosely 
speaking, condition (5) implies that firms do 
not lose too much by locating together. If 
firms do not lose too much by locating to- 
gether and do not take into account the 
effect of their own location choices on the 
location choices of other firms, they might 
see clustering as a way of increasing profits. 
This is because a firm that locates with a 
few of its competitors may be able to draw 
consumers (who are attracted to areas of 
high competition) away from other rivals. 
As is shown in the following proposition, 
the conclusion can be that all firms locate 
together. 


PROPOSITION 2: If condition (5) holds, 
then equilibrium requires that all firms locate 
together. 


PROOF: 

Recall from the proof of Proposition 1 
that the price at any shopping center at- 
tracting at least one consumer is monotoni- 
cally decreasing in the number of firms at 
the shopping center. This and the assump- 
tion that transportation is costless imply that 
each consumer should visit a shopping cen-
ter with the largest number of firms to maxi-
mize her payoff. Given that consumers be-
have in this manner, any firm in a shopping
center that is not occupied by the largest 
number of firms would earn positive (in- 
stead of zero) profit by moving to a shop- 
ping center that is occupied by the largest 
number of firms. Thus, equilibrium requires 
that each shopping center be occupied by 
the same number of firms. 

Now pick an equilibrium and let y denote 
the number of firms in each shopping cen- 
ter. Observe that some firm must have no 
more than [my /n] consumers at its loca- 
tion. By the definition of equilibrium, it 
must be unprofitable for this firm to locate 
with another group of firms and attract all 
the consumers. Hence 


(6)  $\pi([my/n], y) \ge \pi(m, y+1).$

It follows from (4) and (6) that


(7)  $\pi(1,y) \ge (n/y)\,\pi(1,y+1).$


Since y divides n, (5) and (7) imply that y 
equals n. Thus, all firms must be in the 
same shopping center.¹⁰,¹¹


Given a demand function f, one can check 
whether condition (5) holds. For instance, 
this condition is satisfied if there are at least 


¹⁰The proposition clearly depends on the assump-
tion that firms use pure strategies. For example, con-
sider a version of the basic model in which there are
three locations, three firms that may choose location 
randomly, and three consumers that live apart from 
each other. It is easy to show that there is an equilib- 
rium in which each firm chooses each location with 
probability 1/3. However, given the admittedly restric- 
tive simultaneous-movement assumption, there is a 
good reason to exclude mixed-strategy equilibria. 
Namely, if its rivals use mixed strategies, a firm will 
have an incentive to delay its own move, which is 
inconsistent with the assumption of simultaneous 
movement. Note also that mixed-strategy equilibria 
leave subsets of firms with an incentive to share infor-
mation about where they intend to locate.

¹¹The result that equal numbers of firms must oc-
cupy each firm-occupied shopping center may place
strong restrictions on the distribution of firms across 
locations even if condition (5) does not hold; for exam- 
ple, if n is a prime number, equilibrium requires that 
firms either all locate together or all locate apart from 
each other. 
three firms and the consumers’ demand
function is linear.


Example 1. If $f^{-1}(q) = a - bq$, then


(8)  $\pi(1,y) = \frac{1}{b}\left(\frac{a-c}{y+1}\right)^{2}.$


Condition (5) is easily verified using (8), 
under the assumption that n ≥ 3. Thus,
three or more firms facing consumers with 
linear demand will choose to compete by 
locating together. 
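
The verification of condition (5) for linear demand can be sketched numerically. The Python snippet below evaluates $\pi(1,y)$ from the linear-demand formula (8) and tests the inequality in (5) for every divisor y of n other than n itself. The parameter values and function names are illustrative assumptions.

```python
def profit_per_consumer(y, a=10.0, b=1.0, c=2.0):
    """pi(1, y) for linear inverse demand a - b*q, as in (8)."""
    return ((a - c) / (y + 1)) ** 2 / b


def condition_5_holds(n):
    """True if pi(1,y) < (n/y)*pi(1,y+1) for every divisor y of n with y != n."""
    proper_divisors = [y for y in range(1, n) if n % y == 0]
    return all(
        profit_per_consumer(y) < (n / y) * profit_per_consumer(y + 1)
        for y in proper_divisors
    )


# Lists the values of n between 2 and 12 for which condition (5) holds.
print([n for n in range(2, 13) if condition_5_holds(n)])
```

In this sketch the condition fails only for n = 2, which matches the requirement of at least three firms.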

The next two examples explain why con- 
dition (5) is assumed in Proposition 2. The 
first of these shows that condition (5) does 
not hold for all demand functions, and the 
second shows that this condition never holds
in the special case of two firms. In both 
examples, firms are able to raise industry 
profits by avoiding the clustering outcome. 


Example 2. Suppose there are six quantity-
setting firms that produce at zero marginal
cost and four consumers with inverse de-
mand


$f^{-1}(q) = \begin{cases} (1-q)^{8}, & q \in [0,1] \\ 0, & q \in (1,\infty). \end{cases}$

In addition, suppose three firms, A, B, and
C, locate together in one shopping center
and three firms, D, E, and F, locate at
another shopping center. Consumers visit a
shopping center with the largest number of
firms; in case two shopping centers are oc-
cupied by the largest number of firms, con-
sumers a, b, d, and e visit the shopping
center containing firms A, B, D, and E,
respectively.

The proof of Proposition 2 explains why 
the consumers’ behavior is consistent with 
equilibrium. To see that the firms’ behavior 
is consistent with equilibrium and that con- 
dition (5) does not hold, it is enough to 
check that each firm profits as much from 
being a triopolist facing two consumers as 
from being a quadropolist facing four con-
sumers. In fact, the symmetric Cournot
equilibrium profit in the case of three firms
and two consumers is 0.01423, and the sym-
metric Cournot equilibrium profit with four
firms and four consumers is 0.01301 (see 
also Example 4, below). 
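
The two profit figures in Example 2 can be reproduced with a short Python sketch. It assumes an inverse demand of the form $(1-q)^k$ on [0, 1] with zero marginal cost; the exponent k = 8 is an assumption, chosen because it reproduces the two reported figures. For this family the symmetric first-order condition gives a per-firm quantity of x/(y + k) and a price of $(k/(y+k))^k$. The function name is hypothetical.

```python
def profit(x, y, k=8):
    """pi(x, y) for zero marginal cost and inverse demand (1 - q)**k on [0, 1].

    Per-firm quantity is x / (y + k) and the price is (k / (y + k))**k,
    so profit per firm is x * k**k / (y + k)**(k + 1).
    """
    return x * k**k / (y + k) ** (k + 1)


triopoly = profit(2, 3)    # three firms sharing two consumers
quadropoly = profit(4, 4)  # four firms sharing four consumers
print(round(triopoly, 5), round(quadropoly, 5))

# Condition (5) with n = 6 and y = 3 would require pi(1,3) < 2 * pi(1,4).
print(profit(1, 3) < (6 / 3) * profit(1, 4))
```

Under this assumption the triopoly profit exceeds the quadropoly profit, and the inequality in condition (5) fails for n = 6 and y = 3.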


Example 3. If there are two firms, condition 
(5) reduces to 


(9)  $\pi(1,1) < 2\,\pi(1,2).$


Using (4) and the definition of $\pi(\cdot,\cdot)$, con-
dition (9) is equivalent to


(10)  $f^{-1}(q^c(1,1))\,q^c(1,1) - c\,q^c(1,1) < f^{-1}(q^c(2,2))\,q^c(2,2) - c\,q^c(2,2).$


However, inequality (10) cannot hold, since 
$q^c(2,2)$ is available to the monopolist.¹²
Thus, condition (5) implicitly requires the 
presence of at least three firms. 

To construct an equilibrium in which the 
two firms locate apart, suppose there are 
two consumers and choose strategies for the 
firms and consumers that satisfy the follow- 
ing properties: the firms locate apart, both 
consumers’ strategies specify the location 
occupied by both firms if the firms locate in 
the same place, and consumer a (b) locates 
with firm A (B) if the firms locate apart. By 
(3) and the zero-transportation-cost as- 
sumption, the behavior specified for con- 
sumers maximizes their payoffs. Further- 
more, the behavior specified for firms is 
consistent with equilibrium, since the equiv- 
alent conditions (9) and (10) do not hold. 


III. Entry Costs


In the basic model of Section I, access to
the production or retailing technology was
assumed to be limited to n firms. This sec-
tion allows unrestricted access to the tech-
nology, but it assumes that firms entering
the market incur an entry cost of E > 0. A
resulting complication is that the ability of a
given number of firms to cover entry costs


¹²A referee pointed out that this is essentially the
same “monopoly profits exceed joint duopoly profits”
point that underlies the preemptive patenting litera-
ture.
depends on how the firms are distributed
across locations. 

The basic model is easily modified to in- 
corporate unrestricted access to the tech- 
nology and entry costs. Suppose there is a 
countable infinity of potential entrants and 
locations. Each potential entrant either de- 
cides not to enter and thereby avoids the 
positive entry cost or chooses a location and 
automatically incurs the entry cost. Con- 
sumers then decide where to shop, and en- 
trants compete for consumers at each shop- 
ping center as in the basic model. Payoffs to 
entrants and consumers are computed as in 
Section I, except that the entry cost E is 
subtracted from each entrant’s payoff.

In a clustering equilibrium of the unlim- 
ited-entry model, entrants must cover their 
entry costs, and potential entrants who stay 
out of the market must find it unprofitable 
to enter and locate with the clustered en- 
trants. This can be formally restated as 


(11)  $\pi(m,n) \ge E > \pi(m,n+1)$


where n represents the number of entrants.
As long as $\pi(m,1) > E$, this requirement is
satisfied for some n, because E is positive
and the area underneath the demand curve
is bounded.

If (11) holds, a clustering equilibrium will
exist. To see this, suppose that n firms
locate in the same place. By the proof of
Proposition 1, consumer strategies (con-
sistent with equilibrium) can be specified so
that an entrant has no incentive to locate
away from the other, clustered entrants. A
very similar argument can be used to show
that a potential entrant that does not enter
the market has no incentive to locate away
from the clustered entrants. This concludes
the proof of the following extension of
Proposition 1.¹³


¹³It should be noted that an equilibrium may not
exist if the largest number of clustered entrants that
can cover entry costs as a single cluster is two and if
there are even arbitrarily small transportation costs
(see Dudey, 1989a). The difficulty is related to the 
“two-firm problem” discussed in Example 3 and foot- 
note 8. Thus, to obtain an existence result in the case 
of positive transportation costs, it becomes necessary to 
PROPOSITION 3: If $\pi(m,1) > E$, then
there is an equilibrium in the unlimited-entry 
model in which all entrants locate in the same 
shopping center. 
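
To illustrate the free-entry requirement that clustered entrants cover the entry cost while one more entrant would not, the following Python sketch finds the number of clustered entrants n for a linear inverse demand $a - bq$, using $\pi(m,n) = m\,\pi(1,n)$ from (4). The parameter values, the entry cost, and the function names are illustrative assumptions.

```python
def profit(m, n, a=10.0, b=1.0, c=2.0):
    """pi(m, n) = m * pi(1, n) for linear inverse demand a - b*q."""
    return m * ((a - c) / (n + 1)) ** 2 / b


def clustered_entrants(m, entry_cost):
    """Return n with pi(m, n) >= E > pi(m, n + 1), or None if pi(m, 1) < E."""
    if profit(m, 1) < entry_cost:
        return None
    n = 1
    # Profit per firm falls as n grows, so the loop stops at the last n
    # for which n clustered firms can still cover the entry cost.
    while profit(m, n + 1) >= entry_cost:
        n += 1
    return n


n = clustered_entrants(m=4, entry_cost=10.0)
print(n, profit(4, n), profit(4, n + 1))
```

With these assumed numbers, four clustered entrants cover the entry cost and a fifth would not.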


A version of Proposition 2 can also be ob- 
tained in a straightforward manner if com- 
petition among n +1 firms is not much more 
intense than competition among n firms. In
particular, suppose


(12)  $\pi(1,n+1) > (1/2)\,\pi(1,n)$


where n satisfies (11).¹⁴ Now consider an
arbitrary equilibrium outcome. Because of 
the cost of entry, each shopping center must 
attract at least one consumer. Hence, fol- 
lowing the proof of Proposition 2, each 
shopping center must be occupied by the 
same number of firms. In fact, similar rea- 
soning combined with (11) implies that this 
number equals n or n +1. First, suppose it 
equals n +1. Then, by (11), the firms in any 
given shopping center must attract all the 
consumers to cover their entry cost. Thus, 
there can be only one shopping center. If 
there are n firms in each of two or more 
shopping centers, some shopping center 
must be attracting no more than m /2 con- 
sumers. By (4), (11), and (12), the firms in 
that shopping center are not covering entry 
costs. It follows that there is only one shop- 
ping center. This proves the following ver- 
sion of Proposition 2. 


PROPOSITION 4: If condition (12) holds,
then equilibrium in the unlimited-entry model 
requires that all entrants locate in the same 
shopping center. 
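
Condition (12) can be checked directly for a given demand function. As a sketch, the Python snippet below evaluates it for a linear inverse demand $a - bq$, where $\pi(1,n)$ is proportional to $1/(n+1)^2$. The parameter values and function names are illustrative assumptions.

```python
def profit_per_consumer(n, a=10.0, b=1.0, c=2.0):
    """pi(1, n) for linear inverse demand a - b*q."""
    return ((a - c) / (n + 1)) ** 2 / b


def condition_12_holds(n):
    """True if pi(1, n + 1) > (1/2) * pi(1, n)."""
    return profit_per_consumer(n + 1) > 0.5 * profit_per_consumer(n)


# Lists the values of n between 1 and 7 for which condition (12) holds.
print([n for n in range(1, 8) if condition_12_holds(n)])
```

In this linear case the ratio $\pi(1,n+1)/\pi(1,n)$ equals $((n+1)/(n+2))^2$, so the condition fails for n = 1 and holds once n is at least 2.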


IV. Sequential Firm Location Choice¹⁵


Although the assumption of simultaneous 
choice of firm location is plausible for some 
markets and useful for expository purposes, 


impose a requirement implying that at least three 
clustered entrants can cover entry costs. 
¹⁴Using (8), it is easy to verify that condition (12)
can hold when consumers have linear demand. 
¹⁵Jon Eaton suggested the topic of this section.
entry into an industry often occurs sequen-
tially. This section considers a version of the
basic model in which entrants locate se- 
quentially instead of simultaneously. 

Suppose there are n firms that move in 
sequence, where a move consists of a choice 
of whether to enter and (assuming entry) 
where to locate. Let the firms be indexed by 
their order of movement (firm i is the ith 
mover). As in the last section, a firm that 
elects not to enter saves a positive entry 
cost of E; any nonentrant commits its re- 
sources elsewhere and does not return. Sup- 
pose the set of locations is finite and define 
the rest of the game as in Section I. 

In the sequential-entry model, a firm may 
be able to use its own location choice to 
influence the location decisions of other 
firms. As the following example shows, this 
can guarantee that firms will avoid cluster- 
ing even when transportation costs are not 
an issue. 


Example 4. Assume there are three con-
sumers and three firms. Suppose $f^{-1}(q) = e^{-\sqrt{q}}$,
firms produce at zero marginal cost,
and entering firms incur an entry cost of
E = 0.01. Then, one can easily check that


$\pi(1,1) > \pi(3,2) > \pi(3,3) > E.$


Suppose each consumer visits a shopping 
center occupied by the largest number of 
firms; in case the firms locate apart from 
each other, each consumer visits a different 
firm. As noted above, this behavior is con- 
sistent with equilibrium. If the first two firms 
locate apart, the last firm will locate away 
from the first two, since $\pi(1,1) > \pi(3,2)$. Of
course, if the first two firms locate together, 
the third firm will locate with the first two. 
Therefore, the second firm may, in effect, 
choose between locating in a shopping cen- 
ter with two other firms and attracting all 
the consumers and locating by itself and 
attracting one consumer. Since $\pi(3,3) < \pi(1,1)$,
the second firm will locate away
from the first firm if the first firm enters. It 
follows that all firms will enter and locate 
apart from each other. 
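
The backward-induction reasoning of Example 4 can be traced in a few lines of Python. The sketch assumes zero marginal cost and an inverse demand of $e^{-\sqrt{q}}$; this functional form is an assumption, under which the symmetric first-order condition gives a per-firm quantity of 4xy and a price of $e^{-2y}$. Variable and function names are hypothetical.

```python
import math

ENTRY_COST = 0.01  # E in Example 4


def profit(x, y):
    """pi(x, y) for zero marginal cost and assumed inverse demand exp(-sqrt(q)).

    Per-firm quantity is 4*x*y and the price is exp(-2*y).
    """
    return 4 * x * y * math.exp(-2 * y)


pi_11, pi_32, pi_33 = profit(1, 1), profit(3, 2), profit(3, 3)

# Last mover: if firms 1 and 2 are apart, firm 3 compares serving one
# consumer alone with joining a two-firm center that attracts everyone.
third_locates_apart = pi_11 > pi_32

# Second mover: locating with firm 1 leads to a three-firm cluster, while
# locating apart leads (given firm 3's reply) to one consumer each.
second_locates_apart = third_locates_apart and pi_11 > pi_33

# Every firm covers its entry cost in the dispersed outcome.
all_cover_entry_cost = pi_11 > ENTRY_COST

print(pi_11, pi_32, pi_33)
print(third_locates_apart, second_locates_apart, all_cover_entry_cost)
```

Under the assumed demand function the chain $\pi(1,1) > \pi(3,2) > \pi(3,3) > E$ holds, so each step of the argument goes through and all three firms locate apart.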

Thus, a clustering equilibrium will not 
generally exist in the sequential-entry model, 
even when consumers incur no transporta-
tion costs. However, according to the next
proposition, the conditions 


(13a)  $\pi(1,1) - \pi(1,2) < 2E/m$
(13b)  $\pi(1,y) - \pi(1,y+1) < E/m$


for y = 2,...,[m /2] rule out almost all equi- 
librium outcomes except the clustering out- 
come. Like (5) and (12), these conditions 
require that competition among y +1 firms 
not be much more intense than competition 
among y firms for certain values of y. An 
example in which the conditions are satis- 
fied is presented after the proof of the 
proposition. 

The proof fundamentally depends on the 
result that all consumers will visit a shop- 
ping center occupied by the largest number 
of firms. Except for the special case of two
entrants, conditions (13a) and (13b) give the
last entrant an incentive to create a shop-
ping center that is occupied by more firms
than any other. The entry cost ensures that 
no firm will enter unless it forecasts that it 
will be in the (only) shopping center occu- 
pied by the largest number of firms. 


PROPOSITION 5: If $\pi(m,1) > E$, then
there is at least one entrant in the sequential- 
entry model. If conditions (13a) and (13b) 
hold and there are at least three entrants, then 
all entrants locate in the same shopping cen- 
ter. 


PROOF: 

The existence of an equilibrium follows 
from Harold W. Kuhn’s (1953) backward 
induction algorithm. The rest of the proof
characterizes the set of equilibria. 

There must be at least one entrant, since
the last potential entrant, firm n, will cer-
tainly enter if its predecessors do not. Firm
n would find entry profitable since $\pi(m, 1) > E$.
Now consider an arbitrary equilibrium
outcome. Suppose there are several shop- 
ping centers occupied by possibly different 
numbers of firms. Since an entrant receives 
a payoff of — E if no consumers are at- 
tracted to its location, each shopping center 
must attract at least one consumer. Each 
consumer will visit one of the shopping cen-
ters occupied by the largest number of firms.
An equilibrium outcome must therefore 
place the same number of firms at each 
shopping center. 

Suppose this number is y* ≥ 2 and as-
sume that there are two or more shopping
centers, so that n/2 ≥ y*. Let $x_{\min}$ be the
smallest number of consumers visiting any
one of the shopping centers not occupied by
the last entrant. Since the firms in each
shopping center must be able to cover their
entry costs, $\pi(x_{\min}, y^*) \ge E$. Using (4), this
implies that the shopping center occupied
by the last entrant cannot be attracting more
than $m - [E/\pi(1, y^*)]$ consumers and,
hence, that $\{m - [E/\pi(1, y^*)]\}\,\pi(1, y^*) - E$
is an upper bound on the profit of the last
entrant.

Evidently, firm n rejected its option to
attract all consumers to a location with
$y^* + 1$ firms. Its profit from this option would
have been $\pi(m, y^*+1) - E$. Assuming that
firm n is not the last entrant, it must be that
$\pi(m, y^*+1) < E$. Since the last entrant must
cover its entry cost,


(14)  $\{m - [E/\pi(1,y^*)]\}\,\pi(1,y^*) - E \ge \pi(m,y^*+1) - E.$


If firm n is the last entrant, (14) holds by
the definition of equilibrium. Rearranging
(14) and applying (4) yields the inequality


$\pi(1,y^*) - \pi(1,y^*+1) \ge E/m$


which contradicts (13b). A similar argument
together with (13a) may be used to demon-
strate that, if there are at least three en-
trants, y* ≥ 2. Thus, there can be no more
than one shopping center.


An example in which three out of four firms
enter and locate together is presented be-
low. It demonstrates that early movers may
be at a disadvantage.


Example 5. Assume there are 12 consumers
and four potential entrants. If $f^{-1}(q) = 1 - q^{3}$,
firms produce at zero marginal cost,
and entrants incur an entry cost of 1.56,
then it can be shown that

$\pi(6,2) < \pi(12,3)$
$\pi(3,1) < \pi(12,2)$


and that the hypothesis of Proposition 5 is 
satisfied. A simple backward induction ar- 
gument (left to the reader) can be applied 
to demonstrate that, if firm 1 enters, firms 2, 
3, and 4 may gang up to steal firm 1’s share 
of the market. In this outcome, firm 1 stays 
out of the market, and firms 2, 3, and 4 
locate together in the same shopping cen- 
ter. 
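
The claims in Example 5 can be checked numerically. The Python sketch below assumes zero marginal cost and an inverse demand of the form $1 - q^k$; the exponent k = 3 is an assumption, chosen because it is consistent with the inequalities and with the three-entrant outcome reported in the example. For this family, output per consumer t solves $t^k = y/(y+k)$ and the price is k/(y + k). Names are hypothetical.

```python
def profit(x, y, k=3):
    """pi(x, y) for zero marginal cost and assumed inverse demand 1 - q**k.

    Output per consumer t solves t**k = y / (y + k); the price is
    k / (y + k) and each firm sells x * t / y.
    """
    t = (y / (y + k)) ** (1.0 / k)
    return x * (t / y) * k / (y + k)


m, E = 12, 1.56

checks = {
    "pi(6,2) < pi(12,3)": profit(6, 2) < profit(12, 3),
    "pi(3,1) < pi(12,2)": profit(3, 1) < profit(12, 2),
    "pi(m,1) > E": profit(m, 1) > E,
    "(13a)": profit(1, 1) - profit(1, 2) < 2 * E / m,
    "(13b)": all(
        profit(1, y) - profit(1, y + 1) < E / m for y in range(2, m // 2 + 1)
    ),
    "three firms cover E, four do not": profit(m, 3) >= E > profit(m, 4),
}
for name, ok in checks.items():
    print(name, ok)
```

Under this assumption three clustered firms earn just enough to cover the entry cost of 1.56, while a fourth firm in the same cluster would not.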

The last example of this section gives a 
basic reason for limiting the number of po- 
tential entrants in Proposition 5. The exam- 
ple shows that, if potential entry is unlim- 
ited, there may be no entry at all! 


Example 6. Assume that there are count- 
ably many potential entrants and that 


$\pi(m,2) > E > \pi(m,3).$


Assume that consumers visit the shopping 
center that was the first to become occupied 
by the largest number of firms. Now sup- 
pose that each firm uses the following rule 
to decide if and where to enter: if more 
than one firm has entered and no shopping 
center is occupied by more than one firm, 
enter and locate with the last entrant; if one 
firm has entered, enter and locate apart 
from the other entrant; do not enter if no 
other firms have entered or if at least one 
shopping center is occupied by more than 
one firm. It is easy to check that the con- 
sumer and firm behavior just described is 
consistent with equilibrium and results in all 
potential entrants staying out of the market. 


V. Models with Alternative 
Sequencing Assumptions 


Following the classical theory of spatial 
competition, the basic model of Section I 
assumes that consumers choose location af- 
ter firms have located. It departs from the 
classical theory in assuming that consumers 
choose location before prices are deter-
mined. In effect, the basic model postulates
that consumers know the distribution of 
firms across shopping centers when deciding 
where to shop and that firms know local 
demand when choosing quantity. To elabo- 
rate on the role of these postulates, this 
section compares the basic model with two 
related models that involve alternative se- 
quencing assumptions. 


A. A Model with Simultaneous Choices 
of Quantities and Shopping Plans 


This subsection discusses a reasonable 
variant of the basic model in which firms 
choose quantity and consumers decide 
where to shop at the same time. In effect, 
firms are unable to observe local demand 
before making quantity decisions in the 
variant model. 

For any particular specification of m, n, 
$f(\cdot)$, and c, any equilibrium outcome of the
basic model is also an equilibrium outcome 
of the variant model. To see this, fix a 
subgame of the basic model that arises after 
firms have chosen locations $l_1, \ldots, l_n$. It has
already been shown that equilibrium strate- 
gies of consumers may specify any shopping 
center that is occupied by the largest num- 
ber of firms. Suppose consumer j visits the 
shopping center at s(j), j=1,...,m, and let 
$y_{\max}$ equal the largest number of firms at
any shopping center. It has also been ar- 
gued that the equilibrium strategies of firms 
require that each firm behave as a “local 
Cournot oligopolist” given the number of 
consumers that are present at the shopping 
center it occupies. 

Now consider the variant model in which 
consumers decide where to shop at the same 
time that firms choose quantity. Fix the 
subgame of this variant model that arises 
after firms have chosen the locations $l_1, \ldots, l_n$.
Clearly, a firm playing an equilibrium
strategy in the variant model still behaves as 
a local Cournot oligopolist given the num- 
ber of consumers that are present at the 
shopping center it occupies. I claim that it is 
also consistent with equilibrium for con- 
sumer j to visit the shopping center at s(j), 
for all j. If, for all j, consumer j is visiting 
the shopping center at s(j) and firms are 
behaving as local Cournot oligopolists given
this distribution of consumers across shop- 
ping centers, any consumer is obviously 
worse off moving to a location that is not 
occupied by any firms. If any consumer 
moves to another shopping center with y* 
firms that attracts x* other consumers, she 
will find that the price she pays increases 
from $f^{-1}(y_{\max}\,q^c(1, y_{\max}))$ to


$f^{-1}\!\left(\frac{y^*\,q^c(x^*, y^*)}{x^*+1}\right)$


since firms do not adjust quantity in re-
sponse to the move.¹⁶ It follows that any
equilibrium outcome of the basic model is 
also an equilibrium outcome of the variant 
game. In particular, clustering remains con- 
sistent with equilibrium in the variant game. 
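
The price comparison behind this argument can be sketched in Python for a linear inverse demand $a - bq$. The sketch assumes, as in the variant model, that firms at the destination chose quantities for x* consumers and do not adjust when one more consumer arrives, so the fixed total output is spread over x* + 1 consumers. Parameter values and function names are illustrative assumptions.

```python
A, B, C = 10.0, 1.0, 2.0  # assumed linear demand a - b*q and marginal cost c


def per_consumer_output(y):
    """y * q^c(1, y): equilibrium output per consumer with y firms."""
    return y * (A - C) / (B * (y + 1))


def price_if_she_stays(y_max):
    """Price at a center with y_max firms whose output matches its consumers."""
    return A - B * per_consumer_output(y_max)


def price_if_she_moves(y_star, x_star):
    """Price after she joins x_star consumers at a center with y_star firms.

    Total output y* q^c(x*, y*) = x* y* q^c(1, y*) was chosen for x_star
    consumers and is now shared among x_star + 1 of them.
    """
    total_output = x_star * per_consumer_output(y_star)
    return A - B * total_output / (x_star + 1)


print(price_if_she_stays(3))
print(price_if_she_moves(3, 4), price_if_she_moves(2, 4))
```

In this sketch the price she would face after moving exceeds the price at her original center, whether the destination has as many firms as the largest center or fewer.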

There are, however, additional equilib- 
rium outcomes in the variant game. For 
instance, consider the variant game with 
three firms and one consumer. It is easy to 
verify that there is an equilibrium outcome 
with the following features. Firms A and B 
locate together in shopping center 1, and 
firm C locates in shopping center 2. Firms 
A, B, and C choose quantities 0, 0, and 
$q^c(1, 1)$, respectively, and the consumer vis-
its shopping center 2 (the consumer visits a
shopping center with the smallest number 
of firms, and in case of a tie, the consumer 
visits the shopping center occupied by C). 
Thus, assuming that firms choose quantity 
and that consumers make shopping plans 
simultaneously may result in a larger set of 
equilibrium outcomes.¹⁷

The additional equilibria rely on a coordi- 
nation problem which occurs when con- 
sumers do not visit a shopping center occu- 
pied by the largest number of firms. If they 
were given the opportunity to do so, con- 
sumers might try to disrupt such equilibria 
by notifying firms of their intention to visit a 
shopping center occupied by the largest 


¹⁶This statement is another direct consequence of
basic model assumptions and Lemma 1. 
¹⁷An example similar to this one was brought to my
attention by Kyle Bagwell. 
number of firms.¹⁸ In the above example, if
the consumer could convince firms A and B 
of her intention to visit shopping center 1, 
the consumer as well as firms A and B 
would be better off. Thus, one may argue 
that the additional equilibria are unstable in 
the sense that they are not “communica- 
tion-proof.” 


B. A Model with Simultaneous Choices 
of Firm Locations and Shopping Plans 


This subsection considers a version of the 
basic model in which consumers are unin- 
formed about firm locations in the sense 
that no firm expects consumers to react to 
changes in its own location. In other words, 
firms and consumers are assumed to choose 
location simultaneously. 

The clustering outcome remains consis- 
tent with equilibrium in this simultaneous- 
location version of the basic model. Sup- 
pose that all firms and consumers locate in 
shopping center s. Clearly no individual firm 
or consumer has any incentive to move from 
s, since there are no other firms or con- 
sumers outside s. It follows that there is an 
equilibrium in which all firms and con- 
sumers locate together. 

In fact, given m, n, f(-), and c, any other 
equilibrium outcome of the basic model is 
also an equilibrium outcome of the simulta- 
neous-location model. An equilibrium out- 
come of the basic model satisfies the follow- 
ing properties: it places equal numbers of 
firms at each shopping center; each con- 
sumer visits a shopping center; and if there 
are k ≥ 2 shopping centers, then


(15)  $\pi(m, n/k + 1) \le \pi(x_{\min}, n/k)$


where $x_{\min}$ is the smallest number of con-
sumers visiting any shopping center. Now
consider an arbitrary equilibrium outcome 
of the basic model as an outcome of the 
simultaneous-location model. No consumer 
has any reason to move to a different shop- 
ping center, because the same number of 


¹⁸This implicitly assumes that communication is
possible; see Joseph Farrell (1985). 
firms occupies each shopping center. Given
that the consumers do not move, no firm 
would benefit from moving unless 


(16)  $\pi(x_{\max}, n/k + 1) > \pi(x_{\min}, n/k)$


where $x_{\max}$ is the largest number of con-
sumers visiting any location in the specified
outcome. But (4) and (15) imply that (16) 
cannot hold. 

Thus, given m, n, f(-), and c, any equi- 
librium outcome of the basic model is also 
an equilibrium outcome of the simultane- 
ous-location model. However, the following 
example demonstrates that there may be 
additional equilibrium outcomes in the si- 
multaneous-location model. Suppose that an 
equal number of firms and an equal number 
of consumers locate in each of y shopping 
centers. No consumer has an incentive to 
move to another shopping center, since each 
shopping center is occupied by the same 
number of firms. Furthermore, firms have 
nothing to gain from moving from a shop- 
ping center with n/y—1 other firms to a 
location with no consumers or to a more 
competitive shopping center with the same 
number of consumers and n /y other firms. 
Thus, there is a different equilibrium out- 
come associated with every common divisor 
of m and n. 
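
The equal-split outcomes described above can be enumerated in a short Python sketch. It lists every number of shopping centers y that divides both m and n, so that consumers and firms can be spread evenly across y centers. The function name is hypothetical.

```python
from math import gcd


def equal_split_outcomes(m, n):
    """Numbers of centers y over which m consumers and n firms divide evenly."""
    g = gcd(m, n)
    # The common divisors of m and n are exactly the divisors of gcd(m, n).
    return [y for y in range(1, g + 1) if g % y == 0]


for y in equal_split_outcomes(12, 6):
    print(y, "centers:", 6 // y, "firms and", 12 // y, "consumers at each")
```

The case y = 1 is the clustering outcome; the remaining common divisors give the additional outcomes of the simultaneous-location model.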

This multiplicity of outcomes depends, of 
course, on the assumption that no con- 
sumers can react to the location decisions of 
firms. The logic behind Proposition 2 can be 
used to show that the clustering outcome 
may still be the unique equilibrium outcome 
if some (but not necessarily all) consumers 
can react to firm locations when deciding 
where to shop. 


VI. Conclusion 


Classical models of perfect and imperfect 
competition are generally based on the as- 
sumption that consumers have complete in- 
formation about the offerings of different 
sellers. However, in reality, the transmission 
of such information is usually not costless, 
and the degree of competition between firms 
will therefore depend on what firms do to 
facilitate price comparison by consumers. 
This appears to be what Scitovsky (1950)
had in mind when he wrote 


I believe that the market’s perfection 
depends on the buyer’s expertness... 
[T]he geographical concentration of
the expert’s market and the grading
and standardization of products in 
such a market should not be consid- 
ered data, as Marshall did. They are 
the result of a deliberate effort on the 
part of producers; and I believe that 
such an effort will only be made in the 
expert’s market, in response to the 
expert buyer’s demand for easy com- 
parability. [Scitovsky, 1950 p. 49] 


The consumer-search literature stemming 
from Stigler’s seminal work removes the 
assumption of costless price comparison. 
However, almost all of this literature places 
the burden of information collection on 
consumers, while firms simply quote prices 
when approached by consumers. [An excep- 
tion is the interesting paper by Gerard But- 
ters (1977), which considers the interplay 
between informative price advertising by 
firms and consumer search.] 

My findings emphasize that the conven- 
tional approach to consumer search, which 
takes the isolation of firms as given, is a very 
partial equilibrium analysis. Of course, if 
local competition between two firms is suf- 
ficiently intense, firms may choose to locate 
apart;19 but, if local competition is not too 
intense, firms may cluster to facilitate search 
by consumers. 


19For example, if firms choose price instead of 
quantity and produce at constant marginal cost, com- 
petition between two or more firms drives profits to 
zero. Although there are multiple equilibrium out- 
comes in the zero-transportation-cost price-setting ver- 
sion of the basic model, one of the equilibria involves 
all firms locating apart (see Dudey, 1989a, b). 

To Innovate or Not To Innovate: 
Incentives and Innovation in Hierarchies 


By JAMES DEARDEN, BARRY W. ICKES, AND LARRY SAMUELSON* 


Hierarchical organizations often perform poorly in inducing the adoption of 
innovations. We examine a principal offering contracts to agents who make 
unobservable effort and adoption-of-innovation choices (yielding moral hazard), 
who occupy jobs of differing, unobserved productivities (yielding adverse selec- 
tion), and who engage in a repeated relationship with the principal (causing a 
ratchet effect to arise). Increasing the rate of adoption of an innovation in such 
an organization causes the incentive costs of adoption to increase at an increasing 
rate. Relatively low rates of adoption may then be a response to the prohibitive 
incentive costs of higher adoption rates. (JEL 021, 110, 620) 


The Soviet Union has been chronically 
plagued by difficulties with the diffusion of 
innovation. In 1941, for example, Georgii 
Malenkov reported to the 18th Congress of 
the Communist Party that 


... highly valuable inventions and 
product improvements often lie around 
for years in the scientific research in- 
stitutes, laboratories and enterprises, 
and are not introduced into products. 

[Joseph Berliner, 1987 p. 72] 


More recently, Mikhail Gorbachev reported 
to the 27th Party Congress that 


...many scientific discoveries and im- 
portant inventions lie around for years, 
and sometimes decades, without being 
introduced into practical applications. 

[Berliner, 1987 p. 72] 


* Dearden: Department of Economics, Lehigh Uni- 
versity, Bethlehem, PA 18015; Ickes: Department of 
Economics, Pennsylvania State University, University 
Park, PA 16802; Samuelson: Department of Eco- 
nomics, University of Wisconsin, Madison, WI 53706. 
The authors thank Joseph Berliner, Eric W. Bond, 
Herb Levine, participants at seminars at Penn State, 
the University of Pittsburgh, the University of Windsor, 
the University of Pennsylvania, and an anonymous 
referee for helpful comments. Financial support from 
The American University, the National Science Foun- 
dation, and Hewlett Foundation is gratefully acknowl- 
edged, as is the hospitality of the Department of Eco- 
nomics at the University of Illinois, where the third 
author was a visiting professor during the early stages 
of this work. 


The technical achievements of the Soviet 
Union, including such inventions as the hy- 
drogen bomb, Sputnik, and antisatellite and 
space technology, have been exemplary. As 
the quotations above attest, the problem 
lies not in the technical process of invention 
but in the adoption of the resulting innova- 
tions. Why does the Soviet system consis- 
tently produce potentially valuable innova- 
tions but consistently fail to induce the use 
of these innovations? 

This paper examines innovation in hierar- 
chical organizations, focusing particularly on 
process (rather than product) innovations. 
We draw our motivation and examples from 
the Soviet Union because they are the most 
striking, but the analysis can be applied to 
general hierarchical systems. 

Our basic finding is that a key obstacle to 
the adoption of innovations in hierarchical 
organizations is not the cost of inventing or 
developing an innovation but, rather, the 


1For example, F. M. Scherer (1984 part IID) finds 
that the rate of innovation on the part of a firm 
increases as firm size increases but does so at a de- 
creasing rate. He suggests that the relatively poor inno- 
vation performance of large firms “is in turn probably 
due to organizational problems ... although the pre- 
cise mechanism of this phenomenon is not clear” 
(p. 191). We take the distinguishing feature of a hierar- 
chical organization to be that the incentives to adopt 
innovations or take other actions must be explicitly 
constructed by a principal or planner rather than pro- 
vided by the market. A similar characterization is 
adopted by Michael Riordan (1987). 

cost of constructing incentives to induce 
agents to adopt the innovation once it is 
available. We show that increasing the rate 
of adoption of an innovation causes these 
incentive costs to increase at an increasing 
rate. Relatively low rates of adoption, such 
as seen in the Soviet Union, may then be a 
response to the prohibitive incentive costs 
of higher adoption rates. Moreover, achiev- 
ing increased adoption rates without pro- 
hibitive cost may require not just a tinkering 
with the form of incentive contracts, but a 
modification of the hierarchical decision- 
making process.2 

The first difficulty in constructing incen- 
tives to adopt an innovation is that the 
principal generally cannot observe whether 
an innovation has been adopted, being in- 
stead able to observe only the output of an 
agent. This is especially likely to be the case 
with process innovations. Furthermore, the 
level of output is generally affected not only 
by the innovation-adoption decision, but also 
by such factors as an agent’s choice of ef- 
fort, which usually cannot be observed by 
the principal, as well as unobserved exoge- 
nous factors such as the quality of the in- 
puts, facilities, and organization with which 
an agent must work. We generally refer to 
these as simply the productivity of the job 
or enterprise that an agent fills or manages. 

A principal desiring to induce the adop- 
tion of an innovation must then solve a 
principal-agent problem with moral hazard 
(on effort and innovation adoption) and ad- 
verse selection (on job productivity).3 


2A substantial literature considers microeconomic 
models of the diffusion of innovations. Our analysis 
departs from this literature in assuming that the inno- 
vation yields revenue increases that exceed direct 
adoption costs. The difficulty is that the hierarchical 
nature of the organizations forces a principal to con- 
struct costly incentives to induce agents to adopt inno- 
vations. 

3See Oliver Hart and Bengt Holmström (1987) for a 
survey of the principal-agent literature. The principal 
can often come arbitrarily close to a first-best outcome 
if arbitrarily negative payoffs could be attached to 
some outcomes (J. Mirrlees, 1974; Holmström, 1979; 
Steven Shavell, 1979). If ever such a forcing contract 
could be written, the Soviet Union appears to be the 
natural place. Even in the Soviet Union, however, 
there are limits on the penalties that can be imposed. 
One notes that in the Stalinist period, such limits may 

Unfortunately, a ratchet effect appears in re- 
peated principal-agent relationships with 
moral hazard and adverse selection.4 If the 
principal is uncertain concerning the pro- 
ductivity of an enterprise or job, the princi- 
pal has an incentive to use current perfor- 
mance as a signal of that productivity. High 
current output is then followed by more 
stringent future remuneration schemes. The 
benefits from adopting an innovation thus 
tend to be “ratcheted” away. 

In particular, suppose the economy con- 
tains high-productivity and low-productivity 
jobs which cannot be distinguished. The 
principal will generally seek to induce inno- 
vation adoption and high effort, at least 
from agents in high-productivity jobs. Doing 
so will require an “innovation-adoption 
bonus” for the resulting exceptionally high 
output (denoted $y_1$) sufficient to induce the 
agents to bear the cost of adopting the 
innovation and high effort. Suppose now 
that agents in low-productivity jobs are also 
to be induced to adopt an innovation and to 
supply high effort, yielding an output $y_2$ 
with $y_2 < y_1$. An innovation-adoption bonus 
to cover these costs must now be attached 
to $y_2$. The temptation then arises for agents 
in the high-productivity jobs to eschew inno- 
vation adoption and produce $y_2$. These 
agents can then claim to be occupants of 
low-productivity jobs who have adopted the 
innovation and thus collect the adoption 
bonus without bearing the cost of adopting 
the innovation. Even more seriously, the 
agents in high-productivity jobs may adopt 
the innovation but then mimic low-produc- 
tivity jobs by supplying low effort. This pro- 
vides these agents with an innovation-adop- 
tion bonus, savings on effort, and a cushion 
against the ratchet effect arising from hav- 
ing increased the productivity of the job 
without revealing this increase. 


well have not arisen. Interestingly, innovation diffusion 
rates in the Soviet Union were highest in the 1930’s 
(David Dyker, 1985 p. 28), though we do not wish to 
argue that the potentially extra severe sanctions were 
the cause. 

4See, for example, Berliner (1957), A. Nove (1977), 
M. L. Weitzman (1980), Holmström (1982), M. Keren 
et al. (1983), and X. Freixas et al. (1985). 

How is this pooling behavior to be de- 
terred and how are the agents in high- as 
well as low-productivity jobs to be induced 
to supply high effort and adopt the innova- 
tion? The return to output $y_1$ must be in- 
creased even further to make masquerading 
as a low-productivity job unprofitable. In- 
ducing agents in low-productivity jobs to 
adopt an innovation thus carries an extra 
cost related to preserving the desired inno- 
vation-adoption incentives for agents in 
high-productivity jobs. This result readily 
generalizes to organizations with more than 
two job productivities. At each step down 
the scale of job productivities, the adoption 
of an innovation can be induced only if one 
pays the direct and incentive costs of adop- 
tion to the agents in question and also pays 
the increase in incentive cost to agents in all 
higher-productivity jobs. This causes the cost 
of inducing innovation adoption to increase 
at an increasing rate as one proceeds from 
high- to low-productivity jobs. The response 
to this cost-of-adoption schedule may be to 
induce innovation adoption only in jobs of 
relatively high productivity and, hence, to 
induce relatively little use of the innovation. 
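
The argument that adoption costs rise at an increasing rate can be illustrated with a stylized calculation. The sketch below is not the model of Section II; it simply assumes, for illustration, a constant direct cost per job type (`direct_cost`) and a constant extra incentive payment (`rent_increment`) owed to every more productive type each time adoption is extended one step down the productivity scale. Both parameters and their values are hypothetical.

```python
def marginal_adoption_costs(n_types, direct_cost, rent_increment):
    """Marginal cost of extending adoption one step down the productivity scale.

    Step k (k = 1 is the most productive job type) costs the direct cost for
    the newly included type plus an extra incentive payment to each of the
    k - 1 more productive types, so that they do not masquerade as type k.
    """
    return [direct_cost + (k - 1) * rent_increment for k in range(1, n_types + 1)]


def cumulative_costs(marginal):
    """Running total of the marginal costs."""
    total, out = 0.0, []
    for m in marginal:
        total += m
        out.append(total)
    return out


# Hypothetical parameter values, chosen only to make the pattern visible.
marginal = marginal_adoption_costs(n_types=5, direct_cost=1.0, rent_increment=0.5)
total = cumulative_costs(marginal)

for k, (m, t) in enumerate(zip(marginal, total), start=1):
    print(f"types 1..{k} adopt: marginal cost {m:.1f}, total cost {t:.1f}")
```

Under these assumptions the marginal cost rises with each additional job type, so total cost is convex in the number of types induced to adopt, which is the sense in which the principal may prefer to stop at relatively high-productivity jobs.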

This result can be contrasted with the 
outcome of a decentralized or market econ- 
omy. In the latter, the benefits of an innova- 
tion need only exceed the direct costs of 
adoption in order to induce the agent to 
innovate. The market will then induce 
adoption levels that are efficient and that 
tend to be higher than those of the hierar- 
chical system. 

Section I motivates the analysis by provid- 
ing some evidence on the key features of 
the model for the case of the Soviet Union. 
Section II presents a two-period model. 
Section III presents an equilibrium exis- 
tence and characterization result. In Section 
IV we examine potential equilibria and es- 
tablish their properties. In the process, the 
workings of the ratchet effect are exposed. 
Our conclusions are presented in Section V. 


I. Innovation and Diffusion in the 
Soviet Union 


Our analysis rests on three stylized facts: 
that hierarchical systems perform poorly in 
inducing the use of innovations; that job 
productivities, effort levels, and innovation- 
adoption decisions are difficult to monitor 
in a hierarchical system; and that the princi- 
pal’s inability to commit gives rise to a 
ratchet effect. We can illustrate each of 
these for the case of the Soviet Union. 


A growing body of research reveals that 
the Soviet system of bonus contracts is inef- 
fective in providing innovation incentives 
(Phillip Hanson, 1981 p. 64; Berliner, 1976) 
and that differences in innovation are so 
important as to be a major cause of the 
technological gap between East and West 
(Ronald Amann and Julian Cooper, 1986 p. 
12).6 There is also evidence that the prob- 
lem lies not with the technological process 
of invention but with the failure of innova- 
tions to diffuse in the Soviet Union. For 
example, Table 1 reports the date of the 
first introduction of various innovations in 
the Soviet Union and several Western 
economies. Table 2 reports data on the 
spread of these technologies. 

Table 1 indicates that the Soviet Union’s 
record in developing advanced technologies 
is quite good. The initial dates of commer- 
cial production of the various technologies 
generally lag only slightly behind those of 
the four Western economies. Table 2, how- 
ever, reveals that the subsequent diffusion 
of these technologies into widespread use 
has proceeded at a much slower pace in the 
Soviet Union than in the West. In every 
case, a significantly higher fraction of 1982 
output is produced by the new technology in 
the Western economies than in the Soviet 
Union. Given the Soviet tendency to con- 
centrate innovation efforts in leading enter- 
prises such as steel and nuclear power, the 
data in Tables 1 and 2, if anything, overstate 
the success of Soviet innovation attempts. 


5A striking illustration of the bonus-contract system’s 
failure to induce adequate diffusion is provided by the 
fact that some innovations have so stubbornly resisted 
diffusion as to spread into general use only after direct 
intervention on the part of the highest political leader- 
ship. Inducing the use of natural gas, for example, 
required the personal efforts of Nikita Khrushchev 
(Nove, 1977 p. 187). 

6See, for example, Amann and Cooper (1982 p. 24). 
Similar diffusion experiences characterize other 
planned economies. For example, Steven Popper (1988) 
studies the diffusion of numerically controlled machine 

TABLE 1—ADOPTION OF NEW TECHNOLOGIES: DATES OF FIRST 
COMMERCIAL PRODUCTION 

Technology                    USSR   USA    Japan  FRG    UK 
Oxygen steel                  1956   1954   1957   1955   1960 
Continuous cast steel         1955   1954   1960   1954   1958 
Synthetic fiber (nylon)       1948   1938   1942   1941   1941 
High-pressure polythene       1953   1941   1954   1944   1937 
Nuclear power station         1954   1957   1966   1961   1956 
Numerically controlled 
  machine tools               1955   1957   1964   1963   1966 

Source: Amann and Cooper (1986 p. 12). 


TABLE 2—SUBSEQUENT DIFFUSION OF NEW TECHNOLOGIES: PROPORTIONS 
OF OUTPUT PRODUCED BY NEW TECHNOLOGIES IN 1982 

Technology                                 USSR   USA    Japan  FRG    UK 
Oxygen steel (as percentage of 
  total steel)                             29.6   62.1   73.4   80.9   66.1 
Continuously cast steel 
  (as percentage of total steel)           12.1   27.6   78.7   61.9   38.9 
Synthetic fiber (as percentage of 
  total man-made fiber)                    51.2   91.2   83.8   83.1   78.6 
Polymerized plastics (as 
  percentage of total plastics)            46.4   87.5   80.0   73.0   79.3 
Energy generated by nuclear 
  power station (percentage of total)       7.1   12.4   17.6   17.3   16.7 
NC machine tools (as percentage of 
  total metal-cutting machine tools)       16.6   34.0   52.8   20.6   27.7 

Source: Amann and Cooper (1986 p. 13). 


Our analysis presumes that job productiv- 
ity as well as effort and innovation-adoption 
levels cannot be observed by the principal 
(or at least cannot be observed without ex- 
orbitant cost). While it is natural to think of 
effort as unobservable, one might conjec- 
ture that it is easy to observe whether an 
enterprise manager has adopted an innova- 
tion. However, the evidence suggests other- 
wise. In order to fulfill innovation-adoption 
targets, for example, Soviet managers fre- 
quently either adopt artificial or “pseudoin- 
novations” that represent only superficial 
changes in the process of production 


tools in Hungary. He finds much longer diffusion lags 
in Hungary than in Western economies. Neil Leary and 
Judith Thornton (1989), in a study of the Soviet steel 
industry, find not only slower diffusion rates but utiliza- 
tion rates that peak at “much lower ceilings than those 
in the market economies” (p. 65). 





(Berliner, 1976 p. 375) or claim innovation 
adoptions that are actually nonexistent: 


Where the [innovation] plans are ful- 
filled ... one would expect that the 
technological level would be satisfac- 
tory, but in fact this is not always the 
case.... This state of affairs was rec- 
ognized by [Leonid] Brezhnev at the 
XXV Party Congress when he pointed 
out that “there are still products which 
in the reports appear as ‘new’ but in 
fact are new only by the date of pro- 
duction and not by their technical 
level.” [M. J. Berry, 1982 p. 82] 


We can also provide evidence that Soviet 
planners cannot observe effort or, more 
generally, input levels. In the late 1960’s, for 
example, managers of the Shchekino Chem- 
ical Plant were allowed to keep any cost 
savings that could be achieved by employing 
labor more efficiently and releasing excess 
labor for other uses. The response was an 
increase in labor productivity of 52 percent 
in the first year. This experience is revealing 
both because of the extent of the labor 
hoarding or inefficient input use that per- 
sisted under the conventional monitoring 
system and because the response to sus- 
pected labor hoarding was not increased 
monitoring but revised incentives. This pre- 
sumably testifies to the difficulty of monitor- 
ing.7 

If anything, it is not even clear that out- 
put can be observed. This is evident, for 
example, in the existence of the “second 
economy,” where finished goods are often 
diverted from official channels by claims 
that they are “spoiled” (Gregory Grossman, 
1981 p. 76). It is also indicated by the fre- 
quency with which output reports are in- 
flated to make performance appear better 
than it is. These inflated reports go unde- 
tected by the conventional monitoring sys- 
tem but are occasionally uncovered by ex- 
traordinary audits: 


...spot checks of 48 enterprises be- 
longing to the USSR Ministry of Con- 
struction Materials Industry revealed 
significant inflated reports at every 
other enterprise.... Inflated reports 
were found at 20 out of 24 plants and 
associations checked in the USSR 
Ministry of Petrochemical Industry. 
[E. Manevich, 1987 pp. 84-85] 


Recent Soviet discussions reveal that this 
problem of inflated performance reports is 
pervasive. 


7A somewhat unusual illustration of the difficulties 
in observing the productivity of an enterprise is pro- 
vided by noting that Soviet athletes received bonuses 
for winning medals in the 1988 Olympics (Time maga- 
zine, October 3, 1988, Vol. 132, No. 14, p. 58). A basic 
scheme of descending payments for gold, silver, and 
bronze medals was established. However, an athlete’s 
bonus for winning a particular medal was then adjusted 
upward if his finish was higher than expected and 
adjusted downward if the finish was lower than ex- 
pected. Expected finish thus plays a role much like job 
productivity. It is imperfectly observed and is based on 
observations of previous performance. 

8Increased Soviet attention has recently been fo- 
cused on the ubiquity of inflated Soviet performance 

Our final presumption is that the plan- 
ner’s inability to commit to future remuner- 
ation schemes gives rise to a ratchet effect. 
In the Shchekino experiment, the planning 
ministry committed to refrain from revising 
targets for five years. However, the initial 
gains to managers from the increased labor 
productivity were quickly dissipated as plan- 
ners reneged on this “commitment.” The 
Shchekino plant had its instructions rewrit- 
ten seven times in ten years. A second 
enterprise operating on the same system 
suffered 17 changes in five years (Peter Rut- 
land, 1984 p. 353). These are not isolated 
examples. A recognition of the costs of the 
ratchet motivated the reform decrees of 
1979 and the Andropov Experiment of 1983, 
which stipulated that the five-year plan was 
to take precedence over the annual plan in 
order to lengthen the period of commit- 
ment. In practice, however, the planning 
ministries persisted in continually revising 
enterprise performance targets (Ed Hewett, 
1988 pp. 252, 264-65). The Sibtiazhmash 
Productive Association, for example, had its 
norm linking wage funds to performance 
revised four times in 1984 alone (Hewett, 
1988 p. 265). 

There is ample evidence that this inability 
to commit gives rise to a ratchet effect. For 
example: 


The Kornevskii Silicate Brick Plant 
succeeded in 1954 in shortening the 
autoclave baking cycle to 9.8 hours, 
while the industry average was 12.4 
hours. In 1955 they set its plan at 9.7 
hours. Having run into trouble getting 
enough raw materials, the enterprise 
failed to fulfill its plan in the first 
quarter and fell among the lagging 
enterprises, even though it was pro- 
ducing more per unit of equipment 





reports. Spurred by V. Selyunin and G. Khanin (1987), 
Soviet economists have recognized that inflated perfor- 
mance reports can yield significantly overstated growth 
rates and understated inflation rates. Khanin, for ex- 
ample, estimates that Soviet national income increased 
by a factor of 660 percent between 1928 and 1985 as 
compared to the official figure of 8,900 percent. The 
controversy generated by Selyunin and Khanin is dis- 
cussed in R. E. Ericson (1988) and V. G. Treml (1988). 

than other silicate plants which had 
fulfilled their plans. 
[Berliner, 1957 p. 78] 


More generally, Yuri Andropov reported 
that 


The business leader who has ... intro- 
duced in the enterprise a new technol- 
ogy ... not infrequently is a loser, 
while those who avoid that which is 
new lose nothing. [Hewett, 1987 p. 216] 


II. A Model of the Incentives to Innovate 
A. Extensive Form 


We assume that a risk-neutral principal 
(or central planner) hires or writes a con- 
tract with two or more identical risk-neutral 
agents (or enterprise managers). For conve- 
nience, we assume that the relationship be- 
tween the principal and agents lasts for two 
periods and the second period is not dis- 
counted.9 

The jobs to be performed by the agents 
(or enterprises to be managed) can be one 
of two possible types, either high or low 
productivity. Agents observe their job’s pro- 
ductivity. The principal cannot observe the 
productivity of a job and must act on the 
basis of prior beliefs. We can then let the 
parameter $\beta$ denote the productivity of a 
job and let $p_1$ be the prior probability of 
high productivity, so that 

$$
\begin{aligned}
\beta &= \bar{\beta} \quad \text{(high productivity) with probability } p_1, \\
\beta &= \underline{\beta} \quad \text{(low productivity) with probability } 1 - p_1.
\end{aligned}
$$

We can think of nature originally indepen- 
dently choosing a value of $\beta$ for each job 
with this value then characterizing the job 
for both periods. 


9The role of the two-period limitation and possible 
extensions to longer horizons in models of this type are 
discussed briefly in Ickes and Samuelson (1987). 

In period one, and after observing $\beta$, the 
agents choose one of two possible effort 
levels and choose whether to adopt an inno- 
vation. Let $a$ denote the choice of effort 
and $\theta$ the innovation-adoption choice, with 
$\bar{a}$ and $\underline{a}$ denoting high and low effort, 
respectively, and with $\bar{\theta}$ and $\underline{\theta}$ denoting the 
choices to adopt and not to adopt the inno- 
vation, respectively. The period-one choice 
of effort level has implications only for pe- 
riod-one output. However, if the innovation 
is adopted, it makes the job more produc- 
tive both in periods one and two. The prin- 
cipal and the agents both observe output, 
and the principal then makes payments to 
the agents. The principal cannot observe 
the agents’ choices of $(a, \theta)$. The principal 
updates its prior expectation 
concerning the productivity of each agent’s 
job based upon its observations 
of period-one outputs. 
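
The updating step can be sketched with Bayes' rule. The function below is only an illustration: the probabilities with which each job type produces a given output depend on the agents' equilibrium strategies, which are derived later in the paper, so the values passed in here are hypothetical placeholders rather than results of the model.

```python
def posterior_high(prior_high, prob_output_given_high, prob_output_given_low):
    """Posterior probability that a job is high-productivity after an output is observed.

    prior_high:             prior probability that the job is high-productivity
    prob_output_given_high: probability that an agent in a high-productivity job
                            produces the observed output (hypothetical input)
    prob_output_given_low:  the same probability for a low-productivity job
    """
    numerator = prior_high * prob_output_given_high
    denominator = numerator + (1.0 - prior_high) * prob_output_given_low
    if denominator == 0.0:
        raise ValueError("observed output has zero probability under these strategies")
    return numerator / denominator


# Hypothetical case: half of the agents in high-productivity jobs pool on an
# intermediate output that agents in low-productivity jobs always produce.
belief = posterior_high(prior_high=0.5,
                        prob_output_given_high=0.5,
                        prob_output_given_low=1.0)
print(f"posterior probability of high productivity: {belief:.3f}")
```

The more often agents in high-productivity jobs pool on the intermediate output, the higher the posterior attached to that output, and the more stringent the period-two scheme that follows it. This is the channel through which the ratchet effect operates.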

The principal now decides whether to 
have the agents occupy the same jobs in 
period two as in period one or to transfer 
them between jobs. If the agents perform 
the same period-two jobs as they performed 
in period one, then the accumulation of 
job-specific human capital causes period-two 
outputs to be $\alpha$ times the corresponding 
period-one levels, where $\alpha > 1$. If the agents 
are transferred, the job-specific human capi- 
tal is lost. In period two, each job is charac- 
terized by the same basic productivity it 
carried in period one. Agents recall this or 
observe the productivity of their new job if 
they have been transferred. If the innova- 
tion was adopted in the job in period one, 
that innovation adoption continues to boost 
the job’s output. If no adoption occurred, 
no further opportunity arises. Agents then 
make effort choices, output is realized, and 
the principal makes a period-two payment 
to the agents. Table 3 summarizes the se- 
quence of events. 

The principal cannot commit to period- 
two remuneration schemes. Hence, any pe- 
riod-one announcement must include a 
period-two payment scheme that will be op- 
timal for the principal once period two has 
arrived. Equivalently, we can think of the 
principal as announcing the period-two re- 
muneration scheme at the beginning of pe- 

TABLE 3—SEQUENCE OF EVENTS 

Period one: 

1) Nature chooses productivities of jobs ($\beta = \bar{\beta}$ or $\underline{\beta}$); agents observe $\beta$ 
2) Principal announces remuneration scheme for period one and announces 
   whether job transfers will occur 
3) Agents choose effort levels ($a = \bar{a}$ or $\underline{a}$) and make innovation-adoption 
   choices ($\theta = \bar{\theta}$ or $\underline{\theta}$) 
4) Outputs are realized and payments to agents made. Principal updates 
   expectations concerning $\beta$ 
5) Job transfers occur ($\alpha = 1$) if contract calls for transfers; otherwise, job 
   transfers do not occur ($\alpha > 1$) 

Period two: 

6) Jobs retain values of $\beta$ and $\theta$ chosen in period one 
7) Principal announces remuneration scheme for period two 
8) Agents choose effort levels ($a = \bar{a}$ or $\underline{a}$) 
9) Outputs are realized and payments to agents made 


riod two, as shown in Table 3. Given these 
assumptions about commitment possibili- 
ties, we examine a Bayesian subgame per- 
fect equilibrium.10 

The possibility that the agents may be 
transferred among jobs may appear novel. It 
has been shown previously (Ickes and 
Samuelson, 1987) that in the presence of 
the ratchet effect, optimal contracts may 
call for regularly transferring agents be- 
tween jobs. This practice of job transfers 
removes the incentive for an agent in a 
high-productivity job to disguise the job’s 
productivity in order to secure more favor- 
able future remuneration schemes. It does 
so by ensuring that future schemes, applica- 
ble to the new job into which the agent has 


10We assume that the principal can make a credible 
commitment to transferring agents between jobs. Be- 
cause transfers are readily defined and verified, it is 
relatively easy to write and enforce an explicit contract 
or sustain an implicit contract to transfer employees 
from one job to another. In contrast, it is likely to be 
impossible to write and enforce contracts specifying 
future remuneration schemes. As noted in Ickes and 
Samuelson (1987), one readily finds examples in which 
employers commit to transferring employees between 
jobs, but one rarely finds explicitly specified criteria for 
evaluation of job performance and remuneration. For 
example, compare the number of academic depart- 
ments that commit to times at which tenure reviews 
will be conducted with the number that explicitly state 
criteria for tenure. 


been transferred, will not depend upon the 
productivity of the current job. 

Because agents and the jobs they occupy 
are ex ante identical, we can simplify the 
analysis by hereafter examining the rela- 
tionship between the principal and a single 
agent, referred to as “the agent.” Job trans- 
fers in the single-agent model are equiva- 
lent to presuming that job-specific human 
capital does not appear and that, in period 
one, the agent takes the principal’s expecta- 
tions concerning the job as exogenous and 
unaffected by the agent’s actions (because 
they describe expectations in a new job to 
which the agent is transferred). We will also 
occasionally refer to the agent in a high- 
productivity job and the agent in a low-pro- 
ductivity job, but this refers to the two pos- 
sible types of the job filled by the single 
agent. 


B. Assumptions 


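A function $y: \{\bar{\beta}, \underline{\beta}\} \times \{\bar{a}, \underline{a}\} \times \{\bar{\theta}, \underline{\theta}\} \to \mathbb{R}$ 
gives output levels. The function $y$ is 
assumed to satisfy 

$$
\begin{aligned}
y(\bar{\beta}, \bar{a}, \bar{\theta}) &= y_1 \\
y(\bar{\beta}, \underline{a}, \bar{\theta}) = y(\bar{\beta}, \bar{a}, \underline{\theta}) = y(\underline{\beta}, \bar{a}, \bar{\theta}) &= y_2 \\
y(\bar{\beta}, \underline{a}, \underline{\theta}) = y(\underline{\beta}, \underline{a}, \bar{\theta}) = y(\underline{\beta}, \bar{a}, \underline{\theta}) &= y_3 \\
y(\underline{\beta}, \underline{a}, \underline{\theta}) &= y_4.
\end{aligned}
\tag{1}
$$


11We assume that the principal cannot make the 
decision whether to transfer an agent to a new job 
contingent upon the agent’s output in the current job 
and cannot contract to transfer an agent into a job of a 
particular productivity. It is then convenient (and sacri- 
fices no generality) to assume a random-assignment 
rule. Allowing transfers to depend upon current output 
alters some of the details of the calculations and equi- 
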
The intricacies of the problem arise because 
of the pooling possibilities manifest in (1). 
When $y_2$ or $y_3$ is produced, the principal 
cannot distinguish the effort level of the 
agent, the productivity of the job, or whether 
the innovation has been adopted. 
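
The pooling possibilities can be enumerated directly. The sketch below assumes that condition (1) assigns the output level according to how many of the three characteristics (job productivity, effort, and innovation adoption) are "high", with $y_1$ the highest level; the function name and the boolean encoding are illustrative choices, not part of the model's notation.

```python
from collections import defaultdict
from itertools import product


def output_index(job_high, effort_high, adopted):
    """Index i of the output level y_i (i = 1 is the highest).

    Assumed reading of condition (1): each 'high' characteristic raises
    output by one level, so y_1 needs all three and y_4 has none.
    """
    return 4 - (int(job_high) + int(effort_high) + int(adopted))


# Group every (productivity, effort, adoption) combination by the output it yields.
pools = defaultdict(list)
for job_high, effort_high, adopted in product([True, False], repeat=3):
    pools[output_index(job_high, effort_high, adopted)].append(
        (job_high, effort_high, adopted)
    )

for i in sorted(pools):
    print(f"y_{i}: {len(pools[i])} combination(s) -> {pools[i]}")
```

On this reading, the two extreme outputs each reveal the agent's situation exactly, while each of the two intermediate outputs is consistent with three different combinations. This is why the principal cannot infer productivity, effort, or adoption from an intermediate output.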

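The agent derives disutility both from 
supplying effort and adopting the innova- 
tion. We can take $\bar{a}$, $\underline{a}$, $\bar{\theta}$, and $\underline{\theta}$ to be real 
numbers with $\underline{a} < \bar{a}$ and $0 = \underline{\theta} < \bar{\theta}$ and then 
let the agent’s disutility be given by 


Action                                   Disutility 
$\bar{a}, \bar{\theta}$                  $\bar{a} + \bar{\theta}$ 
$\underline{a}, \bar{\theta}$            $\underline{a} + \bar{\theta}$ 
$\bar{a}, \underline{\theta}$            $\bar{a}$ 
$\underline{a}, \underline{\theta}$      $\underline{a}$. 


We then assume that $\bar{\theta} > \bar{a} - \underline{a}$, so that 
$\bar{a} + \bar{\theta} > \underline{a} + \bar{\theta} > \bar{a} > \underline{a}$. This reveals that it is most 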
costly (in utility terms) to supply high effort 
and adopt the innovation and least costly to 
do neither. Of the two possible intermedi- 
ate choices, adopting the innovation with 
low effort is more costly than supplying high 
effort without adopting. The principal must 
provide the agent with nonnegative utility, 
since the agent retains the option of non- 
participation and receiving a utility which 
we normalize to zero. In the Soviet Union, 
for example, managers retain the option of 
becoming workers (though see footnote 3). 
We assume that 


$$
y_1 > y_2 > y_3 > y_4 > \underline{a}. \tag{2}
$$


librium strategies but leaves the basic character of the 
results unchanged. 

The first three inequalities in condition (2) 
indicate that output is higher in a high-pro- 
ductivity job than in a low-productivity job 
and that output can be increased by exert- 
ing high rather than low effort and by inno- 
vation adoption. The final inequality in (2) 
ensures that it is always profitable to em- 
ploy an agent, even if the job is low-produc- 
tivity and low effort is supplied with no 
innovation adoption. We further assume 
that 


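$$
\bar{\theta} > (y_i - y_{i+1}) > (\bar{a} - \underline{a}), \qquad i = 1, 2, 3 \tag{3}
$$

$$
\bar{\theta} < 2(\bar{a} - \underline{a}). \tag{4}
$$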


The first inequality in (3) indicates that the 
disutility of adopting the innovation exceeds 
the single-period gain in output. The second 
inequality indicates that it is profitable to 
induce the agents to supply high effort and 
ensures that high effort will be induced in at 
least some cases. Condition (4) implies that 
over the course of two periods it is less 
costly to raise output via innovation adoption
than through high effort. Conditions (3)
and (4) yield $\bar{\theta} < 2(y_i - y_{i+1})$, which ensures
that innovation adoption is efficient in a 
two-period model. 
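The parameter restrictions above can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the function name `check_assumptions` and the numerical values are hypothetical, and the final inequality of (2) is taken to be $y_4 > \underline{a}$, as its verbal description suggests.

```python
def check_assumptions(y, a_lo, a_hi, theta_hi):
    """Check conditions (2)-(4) and the disutility ordering for one parameter set.

    y is the tuple (y1, y2, y3, y4); the low innovation level is fixed at 0.
    Returns a dict mapping each condition name to True or False.
    """
    gaps = [y[i] - y[i + 1] for i in range(3)]  # y_i - y_{i+1}, i = 1, 2, 3
    effort_gap = a_hi - a_lo
    return {
        # (2): outputs strictly ordered, lowest output covers low effort
        "cond_2": y[0] > y[1] > y[2] > y[3] > a_lo,
        # (3): theta_hi > (y_i - y_{i+1}) > (a_hi - a_lo) for every i
        "cond_3": all(theta_hi > g > effort_gap for g in gaps),
        # (4): innovating once is cheaper than two periods of high effort
        "cond_4": theta_hi < 2 * effort_gap,
        # ordering assumption: theta_hi > a_hi - a_lo
        "ordering": theta_hi > effort_gap,
        # implication of (3) and (4): theta_hi < 2 (y_i - y_{i+1})
        "two_period_efficiency": all(theta_hi < 2 * g for g in gaps),
    }


# Hypothetical numbers chosen only so that every condition holds.
result = check_assumptions(y=(14, 10, 6, 2), a_lo=1, a_hi=4, theta_hi=5)
print(result)
```

Raising `theta_hi` to 7 in this hypothetical set leaves (3) intact but violates (4), which shows that the two conditions bind separately.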


C. Strategies and Equilibrium 


We can now describe formally the strategy
spaces and payoffs of the principal
and agent and state equilibrium conditions.
Let $P_2(\bar{\beta}, \bar{\theta})$ be the principal's period-two
expectation that $\beta = \bar{\beta}$ and $\theta = \bar{\theta}$, with
$P_2(\bar{\beta}, \underline{\theta})$, $P_2(\underline{\beta}, \bar{\theta})$, and $P_2(\underline{\beta}, \underline{\theta})$ being similar.
Let $P_2 \in \{P \in \mathbb{R}^4_+ : \sum P = 1\} = S^4$ be
a vector of such probabilities ($S^4$ is the unit
simplex in $\mathbb{R}^4$). The principal provides a
remuneration scheme in each period to the
agent which specifies the payment, denoted
$h_{\tau i}$, to be made to the agent in the event
that the commonly observed outcome in
period $\tau$ is $y_{\tau i}$ ($i = 1, 2, 3, 4$). The principal's
pure strategy set thus consists of triples
$h = (h_1, t, h_2)$ where $h_1 \in \mathbb{R}^4$, $t \in \{\text{Yes}, \text{No}\}$,
and $h_2: S^4 \to \mathbb{R}^4$. The payment attached in
period one to output $y_i$ is specified as $h_{1i}$; $t$
identifies whether job transfers occur; and
$h_{2i}(P_2)$ identifies the period-two payment
attached to output $y_i$ given expectations $P_2$.

Let $H_2$ be the set of functions $h_2: S^4 \to \mathbb{R}^4$
and let $H$ be the set of triples $(h_1, t, h_2)$.
Then, an agent's strategy is a pair
$z_1: \{\underline{\beta}, \bar{\beta}\} \times H \to \{0, (\underline{a}, \underline{\theta}), (\underline{a}, \bar{\theta}), (\bar{a}, \underline{\theta}), (\bar{a}, \bar{\theta})\}$ and
$z_2: \{\underline{\beta}, \bar{\beta}\} \times \{\underline{\theta}, \bar{\theta}\} \times H_2 \times S^4 \to \{0, \underline{a}, \bar{a}\}$: $z_1(\beta, h)$
gives the agent's period-one choice of $a$
[denoted $z_{1a}(\beta, h)$] and $\theta$ [denoted $z_{1\theta}(\beta, h)$]
as a function of the productivity of the job
occupied by the agent and the remuneration
scheme chosen by the principal; $z_2(\beta, \theta, h_2, P_2)$
gives the agent's period-two choice
of effort as a function of the productivity
and innovation status of the job, the period-two
remuneration scheme, and the principal's
period-two expectation. A choice of
zero is taken to denote nonparticipation,
yielding a utility and an output of zero.

If output $y_{\tau i}$ appears in period $\tau$, then the
principal's payoff in that period is given by
$y_{\tau i} - h_{\tau i}$. Let $\alpha: \{\text{Yes}, \text{No}\} \to \{1, \alpha\}$, denoted
$\alpha(t)$, be a function with $\alpha(\text{Yes}) = 1$ and
$\alpha(\text{No}) = \alpha$, where $\alpha > 1$ is the productivity
parameter capturing the accumulation of
job-specific human capital. Then the principal's
objective for the game is


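(5)    $\max_{h} E_{\beta}\{\, y(\beta, z_1(\beta, h)) - h_1(y(\beta, z_1(\beta, h)))$
       $\quad + \alpha(t)\, y(\beta, z_2(\beta, z_{1\theta}(\beta, h), h_2, P_2), z_{1\theta}(\beta, h))$
       $\quad - h_2(y(\beta, z_2(\beta, z_{1\theta}(\beta, h), h_2, P_2), z_{1\theta}(\beta, h)), P_2) \,\}.$


In the second period, the principal solves


(6)    $\max_{h_2} E_{\beta}\{\, \alpha(t)\, y(\beta, z_2(\beta, z_{1\theta}(\beta, h), h_2, P_2), z_{1\theta}(\beta, h))$
       $\quad - h_2(y(\beta, z_2(\beta, z_{1\theta}(\beta, h), h_2, P_2), z_{1\theta}(\beta, h)), P_2) \,\}.$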


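In keeping with our consideration of a single
agent, payoffs or profits for the principal will always be
taken to mean expected profits per agent.


TABLE 4—AGENT'S PERIOD-ONE UTILITY FROM VARYING CHOICES


Agent's action                           Job                   Outcome   Utility

$(\bar{a}, \bar{\theta})$                $\bar{\beta}$         $y_1$     $h_{11} - (\bar{a} + \bar{\theta})$
$(\bar{a}, \underline{\theta})$          $\bar{\beta}$         $y_2$     $h_{12} - \bar{a}$
$(\underline{a}, \bar{\theta})$          $\bar{\beta}$         $y_2$     $h_{12} - (\underline{a} + \bar{\theta})$
$(\underline{a}, \underline{\theta})$    $\bar{\beta}$         $y_3$     $h_{13} - \underline{a}$

$(\bar{a}, \bar{\theta})$                $\underline{\beta}$   $y_2$     $h_{12} - (\bar{a} + \bar{\theta})$
$(\bar{a}, \underline{\theta})$          $\underline{\beta}$   $y_3$     $h_{13} - \bar{a}$
$(\underline{a}, \bar{\theta})$          $\underline{\beta}$   $y_3$     $h_{13} - (\underline{a} + \bar{\theta})$
$(\underline{a}, \underline{\theta})$    $\underline{\beta}$   $y_4$     $h_{14} - \underline{a}$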


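The agent's payoff in period $\tau$ is given by
$h_{\tau}(y_i)$ minus the disutility of the period-$\tau$
effort and innovation adoption choice. The
agent's single-period utility levels from varying
choices are given in Table 4. In the
second period, the agent in a $\beta$ job (for
$\beta = \underline{\beta}$ or $\bar{\beta}$) thus solves


(7)    $\max_{z_2 \in \{\underline{a}, \bar{a}\}} \{\, h_2(y(\beta, z_2, z_{1\theta}(\beta, h)), P_2) - z_2 \,\}.$


In period one, the agent in a $\beta$ ($= \underline{\beta}$ or $\bar{\beta}$)
job solves


(8)    $\max_{z_{1a} \in \{\underline{a}, \bar{a}\},\ z_{1\theta} \in \{\underline{\theta}, \bar{\theta}\},\ z_2 \in \{\underline{a}, \bar{a}\}} \{\, [h_1(y(\beta, z_{1a}, z_{1\theta})) - z_{1a} - z_{1\theta}]$
       $\quad + [h_2(y(\beta, z_2, z_{1\theta}), P_2) - z_2] \,\}.$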
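The agent's problems can be made concrete with a small sketch. It assumes the output pattern read from Table 4, in which each shortfall (a low-productivity job, low effort, no innovation) lowers output by one step from $y_1$. The function names, the dictionary encoding of payments, and the rule that ties go to high effort are illustrative assumptions rather than the authors' notation.

```python
def output_index(job, effort, innovate):
    """Index i of the output y_i, following the pattern in Table 4.

    job and effort are "hi" or "lo"; innovate is True or False.
    """
    return 1 + (job == "lo") + (effort == "lo") + (not innovate)


def period_two_effort(job, innovated, h2, a_lo, a_hi):
    """Effort level ("lo" or "hi") that solves problem (7).

    h2 maps an output index to the period-two payment. Ties are broken
    toward high effort, assumed here to be the principal's preferred action.
    """
    disutility = {"lo": a_lo, "hi": a_hi}

    def utility(effort):
        return h2[output_index(job, effort, innovated)] - disutility[effort]

    return "hi" if utility("hi") >= utility("lo") else "lo"


# Hypothetical payments: a_hi for output y1, a_lo for y2, nothing otherwise.
h2 = {1: 4.0, 2: 1.0, 3: 0.0, 4: 0.0}
choice = period_two_effort("hi", True, h2, a_lo=1.0, a_hi=4.0)
print(choice)
```

With these payments the agent in a high-productivity job that has innovated is indifferent between the two effort levels, so the tie-breaking rule selects high effort.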


A subgame perfect equilibrium is then a
triple of strategies $(h, z_1, z_2)$ such that the
$h_2$ component of $h$ maximizes (6) given
$z_2$; $h$ maximizes (5) given $z_1$ and $z_2$ and
given that $h_2$ must solve (6); $z_1(\underline{\beta})$ and
$z_1(\bar{\beta})$ each maximizes (8) given $h$; $z_2(\underline{\beta})$
and $z_2(\bar{\beta})$ each maximizes (7) given $h$ and
given that $z_1(\underline{\beta})$ and $z_1(\bar{\beta})$ solve (8); and
period-two posterior expectations $P_2$ are
calculated according to Bayes' rule. This
last requirement warrants some elaboration.
If job transfers are not practiced, this requires
that the $P_2$ appearing in (5)-(8) be
calculated by the principal according to
Bayes' rule and that both the principal and
agent recognize that the agent's actions,
through their effects on $y_{1i}$, affect $P_2$. If job
transfers are practiced, then the principal
again calculates $P_2$ via Bayes' rule and recognizes
that the actions the agent is induced
to take in period one will affect $P_2$. The
agent's maximization takes $P_2$ to be equal
to the equilibrium value calculated by the
principal but to be exogenously fixed at that
level (because $P_2$ applies to a different
period-two job than the one the agent now
occupies).
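The Bayes' rule requirement can be illustrated with a short sketch. It simplifies the four-element vector $P_2$ to a single posterior probability that the job is high-productivity; the function name and the numerical inputs are hypothetical.

```python
def posterior_high(prior_high, prob_output_given_high, prob_output_given_low):
    """Posterior probability of a high-productivity job after observing an output.

    The two likelihoods are the probabilities with which each job type's
    period-one strategy produces the observed output.
    """
    joint_high = prior_high * prob_output_given_high
    joint_low = (1 - prior_high) * prob_output_given_low
    total = joint_high + joint_low
    if total == 0:
        return None  # output is off the equilibrium path; Bayes' rule is silent
    return joint_high / total


separating = posterior_high(0.3, 1.0, 0.0)  # only the high type produces it
pooling = posterior_high(0.3, 1.0, 1.0)     # both types produce it
partial = posterior_high(0.3, 0.5, 1.0)     # high type mixes
print(separating, pooling, partial)
```

A separating outcome drives the posterior to one, a pooling outcome leaves it at the prior, and partial pooling leaves the principal less optimistic than the prior after the pooled output.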


III. Equilibrium Existence and Characterization 


This section presents a basic equilibrium 
existence and characterization result: 


PROPOSITION 1: A subgame perfect equi- 
librium exists. Generically, 


1) the equilibrium is unique; 

2) the equilibrium strategies are pure; 

3) the period-one outcome is separating, in 
that period-one outputs reveal job produc- 
tivities; 

4) all agents are induced to supply high effort 
in period two; 

5) the equilibrium may or may not involve job
transfers, depending upon parameter values;

6) innovation and high effort are induced from
agents in $\bar{\beta}$ jobs in period one;

7) depending upon parameter values, agents
in $\underline{\beta}$ jobs may be induced to supply high
effort and innovate (this may occur with or
without job transfers), may be induced to
supply high effort and not innovate (only
with job transfers), or may be induced to
supply low effort and not innovate (only
without job transfers);

8) the equilibrium will consist of one (and
only one) of four sets of strategies, each
corresponding to one of the four possibilities
identified in 7, depending upon parameter
values. The four sets of strategies are
given in Table 5. The payoff to the principal
from each potential equilibrium is given
in Table 6.


In order to avoid clutter, Table 5 presents only the
actions that agents' strategies yield along the equilibrium
path. Out-of-equilibrium behavior is easily calculated
from (7) and (8).


The characteristics of the four potential 
equilibria are listed in Table 6. 

The Appendix proves this proposition. 
The intuition behind these results is readily 
provided. First, the existence and uniqueness
of the equilibrium follows from a backward
induction argument. Second, the equilibrium
features pure strategies, because the
principal is in general not indifferent over 
agents’ actions. An equilibrium then cannot 
exhibit agent randomization, because the 
principal would respond by slightly increas- 
ing the payoff to the principal's preferred 
outcome and hence inducing pure strategies.
Third, the principal induces separating
outcomes in period one, because the
information gleaned from such outcomes 
allows the principal to reduce the cost of 
period-two contracts. Fourth, given such 
separation, there are no information-based 
obstacles to inducing effort choices in pe- 
riod two, and it is profitable for the princi- 
pal to induce all agents to expend high 
effort in the second period. Fifth, the pe- 
riod-one separating outcome may or may 
not be achieved with the help of job trans- 
fers, depending upon the values of parame- 
ters that determine the rate at which trans- 
fers trade reduced period-one incentive 
costs for sacrifices of human capital accumulation.
Sixth, agents in high-productivity
jobs will always be induced to adopt and 
supply high effort, because the output gains 
from doing so exceed the utility costs (and 


We are assuming here and throughout the analysis
that when an agent is indifferent between two
actions, the agent chooses the action most preferred by 
the principal. This presumption is required for the 
existence of an equilibrium. If the agent does not 
choose the principal’s most preferred action when the 
agent is indifferent, the principal could induce such a 
choice by adding an arbitrarily small $\varepsilon$ to the payoff
from the desired action. As there is no smallest such $\varepsilon$
which will suffice, equilibrium requires $\varepsilon = 0$ and a tie
which is broken in the principal’s favor. It is important 
to note that it is not entirely obvious that ties will be 
broken in the principal’s favor, especially with multiple 
agents. Ching-to Ma (1987, 1988) and Ma et al. (1988) 
examine incentive schemes that do not invoke such an 
assumption.

TABLE 5—EQUILIBRIUM STRATEGIES
; Period 2 ` 
Equilibrium Strategy Period 1 P,=1 P,<1_ Transfer? 
1 Principal: hy 20+4 a — yes 
h, 80+ā a a 
h; a m a 
h4 a ii ae 
Agent: B Gi, a 
B að a 
2 Principal: h, 0+2ā-a ä — yes 
ho a a — 
h, @ = ai 
h 4 a ra a 
Agent: B @,6 a 
B 4,8 a 
3 Principal: A,  06+3a—-2a a — no 
h, 6 + a oF a a 
h; a = a 
hg a sae = 
Agent: B 4,0 a 
B 4,6 : a 
4 Principal: A, 0+8 a — no 
hz a za 
he ee pea z 
ha a iia a 
Agent: B a,0 a 


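Note: $P_2$ is $P_2(\bar{\beta}, \bar{\theta})$.


TABLE 6—SUMMARY OF POTENTIAL EQUILIBRIUM OUTCOMES AND PRINCIPAL'S EXPECTED PROFITS


              Period-1 actions $(a, \theta)$     Period-1 actions $(a, \theta)$                  $\underline{\beta}$ adopt   $\underline{\beta}$ supply
              and outcome $(y)$                  and outcome $(y)$                  Transfers    innovation?    high effort?
Equilibrium   in $\bar{\beta}$ job               in $\underline{\beta}$ job         practiced?

1             $(\bar{a}, \bar{\theta})$, $y_1$   $(\bar{a}, \bar{\theta})$, $y_2$               yes   yes   yes
2             $(\bar{a}, \bar{\theta})$, $y_1$   $(\bar{a}, \underline{\theta})$, $y_3$         yes   no    yes
3             $(\bar{a}, \bar{\theta})$, $y_1$   $(\bar{a}, \bar{\theta})$, $y_2$               no    yes   yes
4             $(\bar{a}, \bar{\theta})$, $y_1$   $(\underline{a}, \underline{\theta})$, $y_4$   no    no    no


Principal's expected profits:

$\pi_1 = p_{\bar{\beta}}[2y_1 - 2\bar{\theta} - 2\bar{a}] + (1 - p_{\bar{\beta}})[2y_2 - \bar{\theta} - 2\bar{a}]$
$\pi_2 = p_{\bar{\beta}}[2y_1 - \bar{\theta} - 3\bar{a} + \underline{a}] + (1 - p_{\bar{\beta}})[2y_3 - 2\bar{a}]$
$\pi_3 = p_{\bar{\beta}}[(1 + \alpha)y_1 - 4\bar{a} - \bar{\theta} + 2\underline{a}] + (1 - p_{\bar{\beta}})[(1 + \alpha)y_2 - 2\bar{a} - \bar{\theta}]$
$\pi_4 = p_{\bar{\beta}}[(1 + \alpha)y_1 - 2\bar{a} - \bar{\theta}] + (1 - p_{\bar{\beta}})[y_4 + \alpha y_3 - \bar{a} - \underline{a}]$


inducing these agents to do so does not
raise the incentive costs associated with
other agents). Seventh, agents in low-productivity
jobs may or may not be induced to
innovate, depending upon the values of parameters
that determine the relative magnitudes
of the increased output and the increase
in incentive costs associated with
high-productivity agents. Finally, we then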
have four possible equilibria, differing ac- 
cording to whether transfers are practiced 
and whether low-productivity agents are in- 
duced to innovate. 


IV. Incentives and Innovation 


The four sets of strategies given in Table 
5 represent four possible equilibria. Be- 
cause the principal moves first in a sequen- 
tial game, we can equivalently view these as 
four possible optimal remuneration schemes 
for the principal (with the associated in- 
duced-agent actions). We begin with the 
question of whether there exist parameter 
values for which it is optimal for the princi- 
pal to offer each remuneration scheme. 


COROLLARY 1: There exist parameter val- 
ues for which each of equilibria 1—4 is the 
unique equilibrium. 


PROOF: 


We prove this by presenting four examples.
In each example,


    $y_1 = 26$      $p_{\bar{\beta}} = 0.3$
    $y_2 = 16.5$    $\bar{a} = 7.8$
    $y_4 = 2$.


Example 1—Remuneration scheme 1 optimal:

    Parameters              Payoffs
    $\bar{\theta} = 9.6$    $\pi_1 = 10.62$    $\pi_3 = 10.2135$
    $\alpha = 1.01$         $\pi_2 = 10.22$    $\pi_4 = 10.1522$.


Example 2—Remuneration scheme 2 optimal:

    Parameters              Payoffs
    $\bar{\theta} = 11.7$   $\pi_1 = 7.89$     $\pi_3 = 8.1135$
    $\alpha = 1.01$         $\pi_2 = 9.59$     $\pi_4 = 9.522$.


Example 3—Remuneration scheme 3 optimal:

    Parameters              Payoffs
    $\bar{\theta} = 9.6$    $\pi_1 = 10.62$    $\pi_3 = 11.955$
    $\alpha = 1.1$          $\pi_2 = 10.22$    $\pi_4 = 11.522$.


Example 4—Remuneration scheme 4 optimal:

    Parameters              Payoffs
    $\bar{\theta} = 11.7$   $\pi_1 = 7.89$     $\pi_3 = 9.855$
    $\alpha = 1.1$          $\pi_2 = 9.59$     $\pi_4 = 10.892$.
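The payoffs reported in the four examples can be recomputed from the profit expressions of Table 6. The sketch below is a check under stated assumptions, not the authors' code. The values $y_3 = 10.6$ and $\underline{a} = 2$ do not appear in the parameter list above; they are back-solved here as the values that reproduce the reported payoffs. The expression used for $\pi_2$ is derived from the cost $C_1$ given in the proof of Proposition 2.

```python
def profits(theta, alpha, y=(26.0, 16.5, 10.6, 2.0), p=0.3, a_hi=7.8, a_lo=2.0):
    """Principal's expected profit under remuneration schemes 1-4.

    y[2] = 10.6 and a_lo = 2.0 are back-solved assumptions, not values
    stated in the text; the remaining defaults are the reported parameters.
    """
    y1, y2, y3, y4 = y
    pi1 = p * (2 * y1 - 2 * theta - 2 * a_hi) + (1 - p) * (2 * y2 - theta - 2 * a_hi)
    pi2 = p * (2 * y1 - theta - 3 * a_hi + a_lo) + (1 - p) * (2 * y3 - 2 * a_hi)
    pi3 = (p * ((1 + alpha) * y1 - 4 * a_hi - theta + 2 * a_lo)
           + (1 - p) * ((1 + alpha) * y2 - 2 * a_hi - theta))
    pi4 = (p * ((1 + alpha) * y1 - 2 * a_hi - theta)
           + (1 - p) * (y4 + alpha * y3 - a_hi - a_lo))
    return (pi1, pi2, pi3, pi4)


def best_scheme(theta, alpha):
    """Number (1-4) of the scheme with the highest expected profit."""
    values = profits(theta, alpha)
    return values.index(max(values)) + 1


for theta, alpha in [(9.6, 1.01), (11.7, 1.01), (9.6, 1.1), (11.7, 1.1)]:
    rounded = [round(v, 4) for v in profits(theta, alpha)]
    print(theta, alpha, rounded, "best:", best_scheme(theta, alpha))
```

Under these assumptions the four parameter pairs select schemes 1, 2, 3, and 4 in turn, matching the example headings.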


The potential optimality of remuneration 
schemes 2 and 4 confirms the argument 
presented in the introduction that it may be 
unprofitable to induce the low-productivity 
agent to adopt the innovation, because of 
the increased incentive costs of inducing the 
high-productivity agent to innovate. Notice 
that transfers occur when $\alpha$ is low (job-specific
human capital is not important) and
that innovation is induced from all agents
when $\bar{\theta}$ is low (innovation is inexpensive).
We can pursue the connection between
parameter values and the optimal remuneration
scheme further. First, we investigate
the conditions under which the principal
will find it optimal to induce innovation
adoption from all agents. Comparing
schemes 1 and 2, we find that the former
entails an extra cost of $\bar{\theta} - p_{\bar{\beta}}(\bar{a} - \underline{a})$ in exchange
for an expected output gain of
$2(1 - p_{\bar{\beta}})(y_2 - y_3)$. Similarly, comparing schemes
3 and 4 reveals that the former entails an
extra cost of $(1 - p_{\bar{\beta}})\bar{\theta} + p_{\bar{\beta}}[2(\bar{a} - \underline{a})] + (1 - p_{\bar{\beta}})(\bar{a} - \underline{a})$
in exchange for an extra output
gain of $(1 - p_{\bar{\beta}})[(1 + \alpha)y_2 - \alpha y_3 - y_4]$. We
thus immediately have the following corollary
(we can identify precise boundaries on
the parameter values for which innovation
adoption will occur, but the resulting expressions
are cumbersome).


COROLLARY 2: Inducing all agents to
adopt the innovation is more likely to be optimal
as $\bar{\theta}$ is small, $y_2 - y_3$ large, $\alpha$ large,
$y_3 - y_4$ large, and $p_{\bar{\beta}}$ small.


These results are not surprising. They indicate
that inducing innovation adoption from
low-productivity agents is more likely to be
optimal when the cost of innovation ($\bar{\theta}$) is
small, the output gains ($\alpha[y_2 - y_3]$ and
$y_3 - y_4$) are large, and it is more likely that jobs
are low-productivity ($p_{\bar{\beta}}$ is small).

For example, inducing innovation adoption from
low-productivity agents is optimal if $\pi_1 > \max\{\pi_2, \pi_4\}$
or $\pi_3 > \max\{\pi_2, \pi_4\}$. After manipulation, this is equivalent
to

$\bar{\theta} < \max\Big[ \min\Big( p_{\bar{\beta}}[\bar{a} - \underline{a}] + (1 - p_{\bar{\beta}})[2(y_2 - y_3)],$
$\qquad p_{\bar{\beta}}[(1 - \alpha)y_1] + (1 - p_{\bar{\beta}})[2y_2 - y_4 - \alpha y_3 - (\bar{a} - \underline{a})] \Big),$
$\quad \min\Big( \frac{p_{\bar{\beta}}}{1 - p_{\bar{\beta}}}[(\alpha - 1)y_1 - (\bar{a} - \underline{a})] + [(1 + \alpha)y_2 - 2y_3],$
$\qquad \frac{p_{\bar{\beta}}}{1 - p_{\bar{\beta}}}[-2(\bar{a} - \underline{a})] + [(1 + \alpha)y_2 - \alpha y_3 - y_4 - (\bar{a} - \underline{a})] \Big) \Big].$


The effect of variations in the marginal
cost of inducing high effort, $(\bar{a} - \underline{a})$, on the
optimality of inducing innovation adoption
from low-productivity agents is ambiguous.
If job transfers are not practiced, so that
schemes 3 and 4 are relevant, increases in
$(\bar{a} - \underline{a})$ make it less likely that low-productivity-agent
innovation adoption is optimal.
This occurs because, without job transfers,
agents in $\underline{\beta}$ jobs are induced to adopt the
innovation if and only if they also are induced
to supply high effort, so that an increase
in the cost of the latter makes it less
likely that inducing innovation adoption is
optimal. In contrast, if job transfers are
practiced, so that remuneration schemes 1
and 2 are relevant, then increases in $(\bar{a} - \underline{a})$
make it more likely that innovation adoption
by agents in low-productivity jobs is
optimal. This occurs because job transfers
reduce the cost of inducing high effort. In
the presence of job transfers, agents in the
$\underline{\beta}$ job are then induced to supply high effort
regardless of whether they adopt the innovation,
and increases in the cost of inducing
high effort are not inimical to inducing innovation
adoption.
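The opposite effects of the effort-cost gap $(\bar{a} - \underline{a})$ can be seen by evaluating the extra-cost and output-gain expressions stated before Corollary 2. The sketch below uses hypothetical parameter values and illustrative function names; `gap` stands for $\bar{a} - \underline{a}$.

```python
def net_gain_with_transfers(gap, p, theta, y2, y3):
    """Scheme 1 versus scheme 2: output gain minus extra cost."""
    output_gain = 2 * (1 - p) * (y2 - y3)
    extra_cost = theta - p * gap
    return output_gain - extra_cost


def net_gain_without_transfers(gap, p, theta, y2, y3, y4, alpha):
    """Scheme 3 versus scheme 4: output gain minus extra cost."""
    output_gain = (1 - p) * ((1 + alpha) * y2 - alpha * y3 - y4)
    extra_cost = (1 - p) * theta + 2 * p * gap + (1 - p) * gap
    return output_gain - extra_cost


# Hypothetical parameters, used only for illustration.
common = dict(p=0.5, theta=5.0, y2=10.0, y3=6.0)
for gap in (3.0, 4.0):
    with_t = net_gain_with_transfers(gap, **common)
    without_t = net_gain_without_transfers(gap, y4=2.0, alpha=1.2, **common)
    print(gap, round(with_t, 4), round(without_t, 4))
```

In this illustration a larger gap raises the net gain from inducing low-productivity innovation when transfers are practiced and lowers it when they are not, which is the ambiguity described in the text.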

These findings direct attention to the role 
in the model played by the agents’ effort 
choices. One might initially wonder why we 
do not strip the model of effort choices and 
concentrate on the choice of innovation 
adoption in jobs of varying productivity. Ef- 
fort choices are essential to the analysis 
because the most profitable deviation for an 
agent from a recommendation of high effort 
and innovation adoption is to adopt the 
innovation but supply low effort. Adopting 
the innovation raises the productivity of the
job while supplying low effort saves on ef- 
fort disutility and (most importantly) dis- 
guises the job’s productivity. This allows the 
agent to avoid more severe future remuneration
schemes while making it easier to attain
high payments from existing schemes.
The highest incentive costs accordingly arise 
in deterring agents from adopting the inno- 
vation while supplying low effort, and an 
important facet of the incentive problem is 
not captured when effort choices are ig- 
nored. 

Attention now turns to job transfers. Ickes 
and Samuelson (1987) demonstrate that job 
transfers may reduce the cost of effort in- 
centives. This is again evident in the results 
of this paper. A comparison of remunera- 
tion schemes 1 and 3, for example, reveals 
that an extra ratchet price of $2(\bar{a} - \underline{a}) - \bar{\theta}$
is required to induce high effort from agents
in $\bar{\beta}$ jobs in scheme 3 (without job transfers),
which is unnecessary in scheme 1 (with
transfers). Because of this, job transfers are
optimal in some circumstances. In addition,
if an agent in a $\underline{\beta}$ job is not induced to
adopt the innovation, as in schemes 2 and 4,
high effort is optimally induced from this
agent only if job transfers are practiced.
This occurs because the cost of such effort
is prohibitive (given $\underline{\beta}$ does not adopt) without
job transfers.

We can identify conditions under which 
job transfers will occur as follows. 


COROLLARY 3: Job transfers are more
likely to be optimal as $\alpha$ is small, $p_{\bar{\beta}}$ is close
to neither 0 nor 1, and $y_2 - y_3$ and $y_3 - y_4$
are large.


This follows immediately from comparing 
profit expressions for job-transfer schemes 1 
and 2 with those for remuneration schemes 
3 and 4. The results are expected. First, job 
transfers sacrifice job-specific human capital 
and, thus, are most likely to be optimal 
when the effect of such capital, or $\alpha$, is
small. Second, job transfers are designed to 
deter agents from concealing job productivi- 
ties, a problem that is most serious when 
the principal entertains significant uncertainty
concerning productivity, so that $p_{\bar{\beta}}$ is
close to neither 0 nor 1. Finally, job transfers
are most likely to be optimal when the
effect of high effort, given by $y_2 - y_3$ and
$y_3 - y_4$, is large. Notice that the effect of
effort costs is again ambiguous. As $(\bar{a} - \underline{a})$
increases, transfers are more likely to be
optimal if all agents are induced to adopt
the innovation but less likely to be optimal
if $\underline{\beta}$ agents are not induced to adopt.

We can now make precise the statement 
that, in the hierarchical system, the cost of 
innovation adoption increases at an increas- 
ing rate. The principal in our model has the 
option of offering a remuneration scheme 
that induces the adoption of innovation from 
no agents, from agents in high-productivity 
jobs only, or from agents in all jobs. Let $C_0$
be the cost of inducing innovation in no
jobs, $C_1$ the cost of inducing innovation in
high-productivity jobs only, and $C_2$ the cost
of inducing innovation from all agents.


PROPOSITION 2: Suppose job transfers are
not practiced. Then for all $p_{\bar{\beta}} \in (0, 1)$,


(9)    $C_2 - C_1 > C_1 - C_0$.

Alternatively, if job transfers are practiced,
then (9) holds for all $p_{\bar{\beta}} \in [0, \bar{\theta} / [\bar{\theta} + (\bar{a} - \underline{a})])$.


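PROOF:
Consider the case of job transfers. We
have


$C_0 = [p_{\bar{\beta}}(2\bar{a} - \underline{a}) + (1 - p_{\bar{\beta}})\bar{a}] + \bar{a}$
$C_1 = [p_{\bar{\beta}}(\bar{\theta} + 2\bar{a} - \underline{a}) + (1 - p_{\bar{\beta}})\bar{a}] + \bar{a}$
$C_2 = [p_{\bar{\beta}}(2\bar{\theta} + \bar{a}) + (1 - p_{\bar{\beta}})(\bar{\theta} + \bar{a})] + \bar{a}$.


The bracketed term in each case gives the
period-one cost of inducing the outcome,
which is the sum of the remunerations
received by a $\bar{\beta}$ and $\underline{\beta}$ agent multiplied by
$p_{\bar{\beta}}$ and $1 - p_{\bar{\beta}}$, respectively. The second term
is the period-two cost. Because these are
separating outcomes, both agents receive $\bar{a}$
in period two, and the period-two cost is
thus $p_{\bar{\beta}}\bar{a} + (1 - p_{\bar{\beta}})\bar{a} = \bar{a}$. The terms $C_1$ and
$C_2$ follow directly from Table 6, while $C_0$
follows from Ickes and Samuelson (1987).
We then calculate $C_2 - C_1 > C_1 - C_0$ if
$p_{\bar{\beta}} \in [0, \bar{\theta} / [\bar{\theta} + (\bar{a} - \underline{a})])$. The no-transfers case
involves a similar calculation.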
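The job-transfers case of the proof can be traced numerically. The sketch below codes the three cost expressions as read from the proof and compares the two cost increments; the function names and parameter values are hypothetical.

```python
def transfer_costs(p, theta, a_hi, a_lo):
    """Costs C0, C1, C2 of inducing innovation in no, high-productivity
    only, or all jobs when job transfers are practiced."""
    c0 = (p * (2 * a_hi - a_lo) + (1 - p) * a_hi) + a_hi
    c1 = (p * (theta + 2 * a_hi - a_lo) + (1 - p) * a_hi) + a_hi
    c2 = (p * (2 * theta + a_hi) + (1 - p) * (theta + a_hi)) + a_hi
    return c0, c1, c2


def increments_are_increasing(p, theta, a_hi, a_lo):
    """True when C2 - C1 > C1 - C0, i.e. when inequality (9) holds."""
    c0, c1, c2 = transfer_costs(p, theta, a_hi, a_lo)
    return (c2 - c1) > (c1 - c0)


def threshold(theta, a_hi, a_lo):
    """Upper bound on p below which (9) holds with job transfers."""
    return theta / (theta + (a_hi - a_lo))


# Hypothetical parameters with theta > a_hi - a_lo, as the model assumes.
bound = threshold(theta=5.0, a_hi=4.0, a_lo=1.0)
print(bound)
print(increments_are_increasing(0.5, 5.0, 4.0, 1.0))
print(increments_are_increasing(0.7, 5.0, 4.0, 1.0))
```

Here $C_1 - C_0 = p_{\bar{\beta}}\bar{\theta}$ and $C_2 - C_1 = \bar{\theta} - p_{\bar{\beta}}(\bar{a} - \underline{a})$, so (9) holds exactly when $p_{\bar{\beta}}$ lies below the bound. Because $\bar{\theta} > \bar{a} - \underline{a}$, that bound exceeds 1/2.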


The result shows that the cost of inducing 
agents in both types of jobs to adopt the 
innovation is more than twice that of induc- 
ing agents in only one type of job to adopt 
the innovation. This is what we refer to as 
the cost of innovation adoption increasing 
at an increasing rate. A first expectation is 
that this result will hold only if $p_{\bar{\beta}}$ is not too
large. If $p_{\bar{\beta}}$ is large, there are many more $\bar{\beta}$
jobs than $\underline{\beta}$ jobs, and the incremental cost
of inducing the relatively few agents in $\underline{\beta}$
jobs to adopt the innovation would appear
unlikely to be larger than the incremental
cost of inducing the relatively many agents
in $\bar{\beta}$ jobs to adopt. Without transfers, however,
we find $C_2 - C_1 > C_1 - C_0$ for all $p_{\bar{\beta}} < 1$.
This result appears because agents in $\underline{\beta}$
jobs can be induced to adopt only if incentive
costs are also paid to agents in $\bar{\beta}$ jobs.
The incremental cost of inducing agents in
$\underline{\beta}$ jobs to adopt then exceeds the corresponding
adoption cost for agents in $\bar{\beta}$ jobs,
regardless of the relative numbers of each
type of job. Because job transfers partially
alleviate the incentive problem (though at
the cost of sacrificing job-specific human
capital), an upper bound on $p_{\bar{\beta}}$ (which always
exceeds 1/2) is required for the comparison
with job transfers.


V. Conclusion 


We have examined the incentives to adopt 
innovations in a hierarchical system such as 
the Soviet planned-enterprise sector. Be- 
cause the principal in such a system is gen- 
erally uncertain as to the productivity of the 
jobs filled by the agents, a ratchet effect 
appears. An agent’s exemplary performance 
is taken as a signal that the agent fills a 
high-productivity job and is accordingly fol- 
lowed by more demanding remuneration 
schemes. Agents then have an incentive to 
disguise the productivity of their jobs. 

The operation of the ratchet effect raises 
particularly severe problems in inducing the 
adoption of innovations. The principal will 
always induce innovation adoption from 
agents in high-productivity jobs and will do 
so by attaching a large payment to the rela- 
tively high output accompanying innovation. 
Suppose now that agents in lower-produc- 
tivity jobs are to be induced to adopt an 
innovation. To do so, the payments attached 
to the outputs produced if these agents in- 
novate must be increased. Unfortunately, 
this increases the return that agents in
higher-productivity jobs can earn by deviating
from recommended actions in order to
disguise the productivities of their jobs. The
payments made to these agents must then
also be increased to deter such deviations.
As a result, each decision by the principal
to induce innovation adoption from agents
in jobs of a given productivity level increases
the incentive costs of inducing innovation
adoption from all agents in jobs of
higher productivity. The principal will thus
begin by inducing innovation adoption in
the highest-productivity jobs and proceed
downward, finding that at each step the
costs of inducing innovation adoption increase
at an increasing rate. The response
to these incentive costs is likely to entail
inducing a relatively low rate of innovation 
adoption. 

We have derived these results in a highly 
stylized model, and we can comment on 
which features of the model are most important.
Similar forces will appear if there
are more than two levels of $\theta$, $a$, and $\beta$,
though the analysis is more complicated (and 
completely separating contracts are unlikely 
to be optimal or even possible if the number 
of values is too large or forms a continuum). 
The basic forces also survive generalization 
to many or infinite periods, though this gen- 
eralization appears to be most interesting if 
job productivities are continually subjected 
to random shocks so that there is always 
new information to be learned. The simplic- 
ity of the results, especially those pertaining 
to job transfers, is driven by the separation 
of the adverse selection and moral-hazard 
problems, with the former applying only to 
jobs and the latter only to agents’ actions. 
The principal faces a much more difficult 
problem if agents also differ in types. In 
some cases, however, job transfers will still 
be a useful device to reduce the incentive to 
disguise job productivities, though transfers 
will be ineffective in eliminating incentives 
to disguise worker productivities. 

We can use these results to illustrate the 
key difference in inducing innovation adoption
between a centralized system and a
decentralized or market economy. Suppose 
that output y sells at a fixed price (which 
we can normalize to equal unity). Then an 
innovation will be adopted whenever the 
increment to output, or $2(y_i - y_{i+1})$, exceeds
the direct cost of adopting, or $\bar{\theta}$. The
market system would thus always induce the 
adoption of the innovations examined in 
our model. This yields an efficient level of 
investment and allows a Pareto efficient 
outcome to appear. Equivalently, we can 
say that the curve identifying the private 
cost of adopting an innovation in the vari- 
ous enterprises in the market economy is 
linear in the number of adoptions and has a 
constant slope or marginal cost equal to the 
technical cost of the innovation. Any innovation
whose private benefits exceed this
technical cost is then adopted. In addition,
the market economy eliminates the incen- 
tive-cost externalities examined above, so 
that private and social costs and benefits 
coincide, yielding an efficient innovation 
level. 

In a hierarchical system, in contrast, the 
corresponding cost curve increases at an 
increasing rate. The marginal cost of adopt- 
ing an innovation equals the technical cost 
of adoption only for initial adoptions. The 
marginal cost of subsequent adoptions in- 
cludes increasingly large increases in incen- 
tive costs. Given this more sharply increas- 
ing cost curve, the optimal response is the 
inducement of fewer innovation adoptions 
than in the decentralized economy. In par- 
ticular, cases will arise in which innovations 
exist whose benefits exceed the direct cost 
of adoption but which are not adopted. This 
yields an outcome with inefficiently low in- 
novation adoption. It appears as if this dif- 
ficulty is inherent in the hierarchical nature 
of the system. Mere adjustments in incen- 
tive schemes within the hierarchical system 
are unlikely to counter the problem. 

Finally, we can return to the case of the 
Soviet Union. We have seen that the opti- 
mal response to the increasing cost-of-adop- 
tion schedule that arises in a hierarchical 
system is relatively little innovation. While 
the Soviet Union exhibits all of the charac- 
teristics of a hierarchical system required to 
yield the increasing cost-of-adoption curve 
and also exhibits relatively little innovation, 
it is clear that the Soviets do not consider 
their innovation adoption rates to be optimal.
Comments such as those of Malenkov
and Gorbachev, quoted in our introduction, 
suggest frustration with achieved perfor- 
mance. Reinforcing this, Gorbachev has 
designated “the acceleration of scientific 
and technical progress” to be “problem 
number one” for the USSR (see Amman 
and Cooper, 1986 p. 1). On the one hand, 
this perceived lack of optimality reflects the 
lack of coordination in Soviet planning and 
a resulting inability to extract the hierarchi- 
cal system’s best performance. In our view, 
however, these sentiments also represent a 
frustration with the constraints imposed by
the hierarchical system and a desire to be
able to operate without such constraints. 
This is reflected in the comment by Abel 
Aganbegyan, one of Gorbachev’s chief eco- 
nomic advisers, that current trends can be 
overcome only by “revolutionary changes”
(Aganbegyan, 1988, p. 84) and by the re- 
forms which form the heart of perestroika. 
Our findings suggest that achieving in- 
creased adoption rates without prohibitive 
cost may require not just a tinkering with 
the form of incentive contracts but a modi- 
fication of the hierarchical decision-making 
process, so that perestroika faces a formid- 
able task. 


APPENDIX 
This Appendix proves Proposition 1 via a 
series of lemmas. In many cases, the proofs 
are straightforward adaptations of previous 
proofs or arguments appearing in Ickes and 
Samuelson (1987) and are omitted. Full
details are available in Dearden et al. (1989). 


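LEMMA 1: In equilibrium, an agent in a $\underline{\beta}$
job earns a zero payoff in period two. If the
period-one outcome is separating, so that one
of $P_2(\bar{\beta}, \bar{\theta})$, $P_2(\bar{\beta}, \underline{\theta})$, $P_2(\underline{\beta}, \bar{\theta})$, or $P_2(\underline{\beta}, \underline{\theta})$
equals unity, then the corresponding agent
earns a zero payoff in period two.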


PROOF: 

This follows directly from the period-two
equilibrium conditions given by (6) and (7).
In particular, the principal will find it optimal
to reduce the payments $h_{2i}$, $i = 1, \ldots, 4$,
until some agent earns a return of zero.
The only question concerns what type of job
that agent occupies. If one of $P_2(\bar{\beta}, \bar{\theta})$,
$P_2(\bar{\beta}, \underline{\theta})$, $P_2(\underline{\beta}, \bar{\theta})$, or $P_2(\underline{\beta}, \underline{\theta})$ equals unity,
then the utility of an agent in the corresponding
job will be reduced to zero, giving
the second result of Lemma 1. Suppose
there is positive probability of either a $\bar{\beta}$ or
a $\underline{\beta}$ job. Because the agent in a $\bar{\beta}$ job can
always produce as much output and hence
secure as much utility as an agent in a $\underline{\beta}$
job, the latter's utility will then be the one
reduced to zero, giving the first statement of
Lemma 1.


LEMMA 2: In equilibrium, the $\bar{\beta}$ agent
chooses $(\bar{a}, \bar{\theta})$ in period one with probability
one.


PROOF: 

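Suppose not. We will show that $h_{11}$ can be
adjusted so as to increase the principal's
profits. If $(\bar{a}, \bar{\theta})$ is played with positive probability,
say $p$, the principal can increase $h_{11}$
to $h_{11} + \varepsilon$ for small $\varepsilon$. This breaks the $\bar{\beta}$
agent's indifference, causing the agent to
play $(\bar{a}, \bar{\theta})$ with unitary probability. This
gives an output gain of at least $p_{\bar{\beta}}(1 - p)(y_1 - y_2)$
at a cost of $\varepsilon$, which is profit-increasing
for the principal for sufficiently small $\varepsilon$.
Suppose then that $(\bar{a}, \bar{\theta})$ is played with zero
probability and that $(\underline{a}, \bar{\theta})$, $(\bar{a}, \underline{\theta})$, and $(\underline{a}, \underline{\theta})$
are played with probabilities $p_A$, $p_B$, and
$p_C$, each of which is positive (the extension
to the case in which one or more of these
equals zero is immediate). Then the indifference
needed to support this randomization
requires $h_{12} - \underline{a} - \bar{\theta} + x_A = h_{12} - \bar{a} + x_B = h_{13} - \underline{a} + x_C$,
where $x_A$ is the expected
period-two return to the $\bar{\beta}$ agent given that
$(\underline{a}, \bar{\theta})$ is played in period one and $x_B$ and $x_C$
are analogous for $(\bar{a}, \underline{\theta})$ and $(\underline{a}, \underline{\theta})$. Now set
$h_{11}$ so that the $\bar{\beta}$ agent is indifferent between
$(\bar{a}, \bar{\theta})$ and $(\underline{a}, \bar{\theta})$, $(\bar{a}, \underline{\theta})$, or $(\underline{a}, \underline{\theta})$. This
requires $h_{11} - \bar{a} - \bar{\theta} = h_{12} - \underline{a} - \bar{\theta} + x_A = h_{12} - \bar{a} + x_B = h_{13} - \underline{a} + x_C$ or


(A1)    $h_{11} - h_{12} = \bar{a} - \underline{a} + x_A$
        $h_{11} - h_{13} = (\bar{a} - \underline{a}) + \bar{\theta} + x_C$.


This choice of $h_{11}$ induces the $\bar{\beta}$ agent to
choose $(\bar{a}, \bar{\theta})$ with probability one (otherwise
add $\varepsilon$ to $h_{11}$). The cost to the principal
of setting this value of $h_{11}$, from (A1), is
then at most


(A2)    $p_{\bar{\beta}}[\, p_A(\bar{a} - \underline{a} + x_A) + p_B(\bar{\theta} + x_B) + p_C(\bar{a} + \bar{\theta} - \underline{a} + x_C) \,]$


where $p_{\bar{\beta}}$ is the probability of a $\bar{\beta}$ agent.
The gain to the principal from setting this
value of $h_{11}$ is at least


(A3)    $p_{\bar{\beta}}\{\, p_A[y_1 - y_2 + x_A] + p_B[2(y_1 - y_2) + x_B] + p_C[y_1 - y_3 + y_1 - y_2 + x_C] \,\}$.


From (3) and (4), we now see that (A3)
exceeds (A2). This precludes the optimality
of an outcome in which the $\bar{\beta}$ agent mixes
over any of $(\underline{a}, \underline{\theta})$, $(\bar{a}, \underline{\theta})$, and $(\underline{a}, \bar{\theta})$ and
completes the proof.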
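The comparison of (A3) with (A2) can be checked term by term. The sketch below codes the two bounds as they read in the proof of Lemma 2 and evaluates them at hypothetical values that satisfy (3) and (4); the function name and all numbers are illustrative assumptions.

```python
def lemma2_bounds(p_high, probs, x, y, a_lo, a_hi, theta):
    """Return (cost_bound, gain_bound), i.e. expressions (A2) and (A3).

    probs = (pA, pB, pC) are the mixing probabilities, x = (xA, xB, xC)
    the expected period-two returns, and y = (y1, y2, y3) the outputs.
    """
    p_a, p_b, p_c = probs
    x_a, x_b, x_c = x
    y1, y2, y3 = y
    gap = a_hi - a_lo
    cost = p_high * (p_a * (gap + x_a)
                     + p_b * (theta + x_b)
                     + p_c * (gap + theta + x_c))
    gain = p_high * (p_a * (y1 - y2 + x_a)
                     + p_b * (2 * (y1 - y2) + x_b)
                     + p_c * ((y1 - y3) + (y1 - y2) + x_c))
    return cost, gain


# Hypothetical values: output steps of 4, effort gap of 3, theta of 5.
cost, gain = lemma2_bounds(0.5, (0.3, 0.3, 0.4), (0.5, 0.5, 0.5),
                           (14.0, 10.0, 6.0), a_lo=1.0, a_hi=4.0, theta=5.0)
print(round(cost, 4), round(gain, 4))
```

Each bracketed term of the gain exceeds the matching term of the cost: $y_1 - y_2 > \bar{a} - \underline{a}$ by (3), $2(y_1 - y_2) > \bar{\theta}$ by (3) and (4), and the third term combines the two.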


LEMMA 3: The $\underline{\beta}$ agent does not play $(\underline{a}, \bar{\theta})$
with positive probability in period one.


PROOF:

The $\underline{\beta}$ agent strictly prefers playing
$(\bar{a}, \underline{\theta})$ to $(\underline{a}, \bar{\theta})$, since $(\bar{a}, \underline{\theta})$ provides an identical
period-one payment of $h_{13}$, yields less
disutility, and trivially allows the $\underline{\beta}$ agent to
still reap the equilibrium period-two utility
of zero.


LEMMA 4: The $\underline{\beta}$ agent plays a pure strategy
in period one.


PROOF: 
Analogous to Lemma 2. 


LEMMA 5: There are five possible equilibrium
paths, described in Table A1, where the
period-two equilibria are as given in Table A2.


PROOF: 

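Lemmas 1-4 indicate that, in equilibrium,
agents in $\bar{\beta}$ jobs must play $(\bar{a}, \bar{\theta})$ in period
one while those in $\underline{\beta}$ jobs must play a pure
strategy of either $(\bar{a}, \bar{\theta})$, $(\bar{a}, \underline{\theta})$, or $(\underline{a}, \underline{\theta})$.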


lé since yı is infeasible for an agent in a £ job, the 
only implications of this adjustment of h,, for an agent 
in a 8 job is that it may affect P,(8,@) and hence 
the period-two remuneration scheme. Since the 8 agent 
earns zero utility in any period-two remuneration 
scheme (Lemma 1), this adjustment cannot affect the 
actions of an agent in a B job and hence does not 
affect the outcomes or costs that appear if the job is 8. 
TABLE A1—SUMMARY OF POTENTIAL EQUILIBRIUM PATHS 


Period-one Period-one 
actions (a, @) outcome type: 
Outcome and outcome separating (S) 
path (y) in B job or pooling (P) 
2 a0 y, S 
3 ā 0 y S 
4 aô y S 
5 ä ô y S 


TABLE A2—PERIOD-TWO EQUILIBRIA 


Period-two equilibrium 1A° 


Strategy PAB,8)< p*  P(B,6)> p* 
Principal: A, 2a—-@ a 

ħa a a 

hy, a a 

hog g — 
Agent:> z,(B, -) a (ya) ā (y) 

zB, *) a (ya) a (y3) 
Strategy Period-two equilibrium 2A 
Principal: fo; a 

ho a 

ho, 4 
Agent:? z(ßB,-) ä (yi) 

z(B,° a (y4) 

a-a 
“Where p*=1 


: a(t)(¥2 y) 
Agent’s output is given in parentheses. 


Depending upon whether job transfers oc- 
cur, this yields six period-one outcomes. 
However, it is suboptimal for the principal 
to induce (2,6) from B agents and (a,@) 
from ß agents and to practice transfers. In 
particular, agents in 8 jobs would then pro- 
duce output y,, and there would be no 
opportunity for agents in B jobs to pool. 
Accordingly, there is no reason to sacrifice 
job-specific human capital by transferring, 
and this outcome path is suboptimal. This 
leaves the five paths listed in Table A1. It 
remains to show that a unique period-two 
equilibrium, given by 1A or 2A, can be 
Period-one 
actions (a, 6) Job 
and outcome transfers Period-two 
(y) in B job practiced? equilibrium 

a6 ya yes 1A 

a@ y, yes 2A 

ä Ë y, no 1A 

a8 ya no 2A 

Z8 yy no 2A 


associated with each path. This follows the 
analysis of Ickes and Samuelson (1987). 


LEMMA 6: Only outcome paths 1—4 consti- 
tute potential equilibria. The equilibrium asso- 
ciated with each path and the principal’s 
profits are as shown in Tables 5 and 6. 


PROOF: 

We examine the first path. The others are 
analogous. In the first case, the first-period 
outcome reveals the productivity of a job, 
with an outcome of y, (y2) indicating that 
the job has high (low) productivity. The 
period-two remuneration scheme will be 1A 
and for any job will induce high effort and 
yield the agent a period-two utility of zero. 
The complete remuneration scheme is 
shown in Table 5." It remains to show that 
the principal’s strategy is optimal (i.e., that 
it induces the desired outcome at minimum 
cost). Consider the choices of actions avail- 
able to the agents in period one and the 
resulting utilities reported in Table A3. The 
optimal action for an agent in either a high- 
or low-productivity job is clearly (@,6@), as 
desired. Given the payoffs to the alternative 
choices of (a, 0) for an agent in a 8 job and 
(ā,0) for an agent in a B job, the scheme 
also induces the desired outcomes at mini- 


WA complete specification of a remuneration scheme 
must also identify the inferences drawn by the princi- 
pal if an out-of-equilibrium first-period outcome of y, 
or y, appears. We assume here that the principal 
assumes that such an outcome reveals the job to be of 
low productivity. It is easily verified that this does not 
disrupt the equilibrium. Similar choices apply to subse- 
quent remuneration schemes, and we omit the details. 
TABLE A3—AGENT’S TWO-PERIOD UTILITIES FROM
VARYING CHOICES GIVEN REMUNERATION SCHEME 1 
Agent Action Period-one utility Period-two utility Total utility 
B (4,8) 0 0 0 
(a, @) a-a@a 0 —(@— a) 
(a, 9) 0 0 0 
B (ā,8) 8 0 J 
(ā,0) 8 0 6 
(a,@) 0 0 -0 
mum cost. A key step in these calculations REFERENCES 
Intergenerational Income-Group Mobility 
and Differential Fertility 


By C. Y. CYRUS CHU AND HUI-WEN KOO* 


One question development economists are especially interested in, but so far left 
unanswered, is: how would the societal income distribution be affected by 
introducing a family-planning program to reduce the reproduction rate of the 
poor, which is usually high in developing countries? The purpose of this paper is 
to search for analytical answers to this question. We are able to make definite 
comparisons about some class of inequality measures of the steady-state societal 
income distributions, and these comparisons provide strong theoretical support in 
favor of the above-mentioned family-planning program. (JEL 112, 841) 


In the past two decades, considerable at- 
tention has been given to the investigation 
of the relationship between population 
growth and the distribution of income. Many 
researchers have argued, based on their em- 
pirical findings, that population growth rate 
is positively related to income inequality.¹ 
However, Bryan Boulier (1982) and David 
Lam (1986b) have criticized these empirical 
results as too sensitive both to model speci- 
fications and to the selection of data sets. 
Moreover, because of the absence of a rig- 
orous theoretical structure, there are dif- 
ficulties both in interpreting the empirical 
evidence and in deducing persuasive policy 
implications from such evidence. 

The difficulty associated with the theoret- 
ical modeling of the relationship between 
income distribution and population growth, 


*Chu: Department of Economics, National Taiwan 
University and Institute of Economics, Academia 
Sinica; Koo: Department of Economics, National Tai- 
wan University, Taipei, Taiwan. We thank Brian Arthur 
at Stanford University for his continuous encourage- 
ment, which facilitated the completion of this paper. 
We are also indebted to Christophe Lefranc and two 
anonymous referees for their various comments and 
suggestions on an earlier version of this paper. Any 
remaining errors are our own responsibility. 

¹See, for example, Irma Adelman and Cynthia Taft 
Morris (1973), James Kocher (1973), W. Rich (1973), 
Calman R. Winegarden (1978), Robert Repetto (1979), 
and Rati Ram (1984). For a detailed survey of the 
literature, see David Lam (1986b). 
as Lam (1986a) pointed out quite correctly, 
hinges upon two factors: income-specific 
differential fertility and intergenerational 
income mobility. Indeed, when a distribu- 
tion of income is referred to, it goes without 
saying that we are thinking about an econ- 
omy with various income groups. As long as 
the reproduction rates or crude fertility rates 
across income groups are different, as is 
especially obvious in most developing coun- 
tries, the property of income-specific dif- 
ferential fertility has to be embodied in the 
model. With differential fertility, population 
growth rate by definition becomes simply 
the weighted average of reproduction rates 
of all income groups, and therefore the 
compositional effects will confound the rela- 
tionship between population growth and in- 
equality. This suggests that the causal rela- 
tionship between income distribution and 
population growth as a whole is not a very 
meaningful topic to analyze, and the key 
question that needs to be addressed instead 
is the relationship between income distribu- 
tion and reproduction behavior of some 
particular income groups. 

The second difficulty associated with the 
theoretical modeling is the concern of inter- 
generational income mobility. It is clear that 
the children of all income groups have the 
possibility of upward (or downward) mobil- 
ity into other income groups. Since the soci- 
etal distribution of income for any specific 
time period is formed by the income of its 
members, the stochastic evolution of
parent-child income transition at the family 
level has inevitably rendered any discussion 
on the evolution of income distribution at 
the society level very complicated. 

Recently, some progress has been made 
in modeling the complex interactions be- 
tween income distribution and population 
growth. Lam (1986a) and Chu (1988, 1990) 
were able to characterize both differential 
fertility and intergenerational income mo- 
bility in a dynamic framework. The combi- 
nation of the reproduction activity within 
each income-specific family and the dy- 
namic income transition manifested in two 
generations of family members turns out to 
form a multitype Markov branching process, 
where the “type” refers to family income. 
Under certain regularity conditions, steady- 
state distribution of income and population 
growth rate have been shown to exist. Theo- 
retically, the above model can then be used 
to evaluate the policy impact of changing 
the reproduction rate of one particular in- 
come group on the various properties of the 
steady-state income distribution. 

Although the Markov branching process 
mentioned above has provided us with a 
neat structure for analyzing the dynamic 
interactions between income distribution 
and population growth, economists have so 
far been unable to derive any interesting 
policy implications from such a model. Re- 
cently, Lam (1986a p. 1110) has raised the 
following question: how would a different 
reproduction rate of the poor change vari- 
ous income-inequality measures in the 
steady state? Given the arguments of many
economists that, in most developing countries,
income inequalities are caused by high
population growth² and that the high population
growth rates in most developing areas
are due to the high reproduction rate of the
poor,³ his question clearly has policy ramifications.
Unfortunately, even the simplest
version of Lam’s question (how would a
different reproduction rate of the poor affect
the proportion of poor in the steady
state?) has not been answered satisfactorily.⁴
With even the simple question left
unanswered, it is only logical that one hesitate
before analyzing, or even posing, the
more complicated but vital question: what
impact will a different fertility rate of the
poor have on the various inequality measures
of the steady state? 


²See Lam (1986a) for detailed references. 

³As pointed out by M. S. Ahluwalia (1976 p. 326), 
“the most important link between population growth 
and income inequality is provided by the fact that 
different income groups grow at different rates, with 
the lower-income groups typically experiencing a faster 
rate of natural increase.” 

The purpose of this paper is to search for 
analytical answers to the above question. 
The theoretical structure of this paper to- 
gether with some elaborated discussion of 
background motivations are presented in 
Sections I and II. Section III provides the 
main theorem of this paper and its accom- 
panying corollaries. The theorem essentially 
says that, as long as the income-specific 
reproduction rates in developing countries 
fit the stylized fact described in Ahluwalia 
(1976) and the intergenerational income- 
transition probability matrix satisfies the 
property of conditional stochastic mono- 
tonicity, which is a slight variant of the 
monotonicity conditions of G. I. Kalmykov 
(1962) and D. J. Daley (1968), then an in- 
crease in the reproduction rate of the poor 
will generate a steady-state income distribu- 
tion that is conditionally first-degree 
stochastically dominated (CFSD) by the 
original distribution. The derived CFSD 
property implies the usual first-degree 
stochastic dominance (FSD) result of Josef 
Hadar and William R. Russell (1969), which 
enables us to compare some class of in- 
equality measures of the societal income 
distribution in the steady state. Section IV 
shows that the elasticity of a change in the 
poor’s fertility rate on the steady-state pop- 
ulation growth can be easily calculated from 
the information of the income transition 
structure. Section V discusses various extensions
and modifications of the results derived
in Section III. Section VI provides a 
numerical example and demonstrates how 
the various comparative static results can be 


⁴See Chu (1987) for detailed explanation. 
calculated, and the final section contains 
summaries and conclusions. 


I. Theoretical Framework 


In this section, we shall propose a theo- 
retical structure to characterize the interac- 
tion between income-specific differential 
fertility and the intergenerational transmis- 
sion of inequality. Although this interaction 
has been successfully characterized as a 
Markov branching process in Lam (1986a) 
and Chu (1987), very little discussion about 
the economic content behind the mathemat- 
ical structure has been given. In what fol- 
lows, we shall briefly present a household- 
decision framework in which parents’ saving 
and fertility decisions are made endoge- 
nously. Furthermore, we will demonstrate 
how these family decisions are related to 
the Markov branching processes in ques- 
tion. It is believed that the present brief 
introduction will be helpful to the under- 
standing of the insights of our later presen- 
tation. 

Let us consider the usual one-sex over- 
lapping-generation framework with altruis- 
tic parents. Each individual lives two peri- 
ods, young and old. At the beginning of 
people’s old period, they receive bequests 
or other forms of endowments from their 
parents and from their own families. How a 
family head’s income is related to the en- 
dowment he receives from his parent will be 
discussed below. We shall first discuss the
decisions he is going to make given his
family income $y_t$. The first decision a family
head has to make is the number of children
he wants to bear ($F$). After this is done, he
has to divide his family income into family
consumption ($c_t$) and savings $s_t = y_t - c_t$,
which are further divided among children,
either as human capital investment funds or
as bequest. 

Since each person may have different 
ability or luck, the relationship between a 


⁵More details about the connection between the 
micro-level household decision and the macro-level 
branching process of the society can be found in Chu 
(1990). 
child’s earned income, denoted $y_{t+1}$, and
his endowed capital (or bequests), denoted
$s_t/F$, is not definite. Following Glenn
Loury (1981), suppose we let a production
function $f(\cdot,\cdot)$ summarize the interactions
between a child’s endowed capital, his luck
or ability (denoted $\alpha_t$), and his earned
income ($y_{t+1}$): 


(1) $y_{t+1} = f(s_t/F, \alpha_t)$. 


In general, parents’ optimal savings and 
optimal fertility are both dependent upon 
their family income, and hence we can 
rewrite (1) as 


(2) $y_{t+1} = f(s(y_t)/F(y_t), \alpha_t) = \tilde{f}(y_t, \alpha_t)$ 


where $s(\cdot)$ and $F(\cdot)$ respectively represent
the optimal saving and fertility functions.
Suppose $\alpha_t$ is independently identically distributed
for all individuals in all periods and
that $\partial f/\partial\alpha_t > 0$ (the marginal productivity
of “luck” is always positive); then, 


(3) $\Pr(y_{t+1} \le y \mid y_t = x) = \Pr(\tilde{f}(y_t, \alpha_t) \le y \mid y_t = x) = \Pr(\alpha_t \le \tilde{f}^{-1}(x, y)) = \mathcal{M}(y, x)$ 


where $\tilde{f}^{-1}$ is the inverse function of
$\tilde{f}$: $\tilde{f}(x, \tilde{f}^{-1}(x, y)) = y$. $\mathcal{M}(\cdot,\cdot)$ in (3) characterizes
the cumulative transition probability
function between $y_t$ and $y_{t+1}$. Suppose the
state space is discrete and there are $n$ income
classes in the economy with incomes
$y^1 < y^2 < \cdots < y^n$; then, we can relate the
transition mass function $M_{ij} = M(i,j)$ defined
in Lam (1986a), which represents the
probability that a child of income class $j$ 


⁶Here we do not consider the complication that 
parents may divide their bequests unevenly among 
children, either to compensate or to reinforce the 
ability differences of children (see e.g., Eytan Sheshin- 
ski and Yoram Weiss, 1982). 
becomes a member of class $i$, with our
cumulative transition function $\mathcal{M}(\cdot,\cdot)$: 


$\Pr(y_{t+1} \le y^i \mid y_t = y^j) = \mathcal{M}(y^i, y^j)$. 


What is demonstrated above is a micro 
foundation for the Markov branching pro- 
cess given in Lam (1986a). From (2) and (3), 
we see that there are two implicit elements 
in the specification of intergenerational 
transmission of income distribution: par- 
ents’ optimal saving function s(-) and their 
optimal fertility function F(-). The former 
contains parents’ implicit trade-offs be- 
tween present family consumption and be- 
quests or investments on children, and the 
latter characterizes a balance between par- 
ents’ marginal benefit and marginal cost of 
childbearing. Factors that affect these two 
decisions will also affect the income transi- 
tion structure and, hence, ‘the steady-state 
income distribution. It should be noticed 
that parents’ fertility decision and consump- 
tion/saving decision usually interact with 
each other. For example, if a parent is con- 
sidering whether to have an additional child, 
he will note that this child will have a capi- 
tal-dilution effect on per-child bequests [see 
(2)], and hence he may also want to con- 
sider a change in his saving decision. Thus, 
a policy that induces parents to reduce fer- 
tility sometimes will also cause a change in 
the income transition structure in (3). 
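
The link between equations (2) and (3) and the capital-dilution effect can be made concrete with a small numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the production function is taken to be f(k, alpha) = alpha * k**gamma, luck is uniform on (0, alpha_max), and the saving, fertility, and class cutoffs are arbitrary numbers.

```python
# Illustrative assumptions (not from the paper): f(k, alpha) = alpha * k**gamma,
# with luck alpha ~ Uniform(0, alpha_max); k = s / F is the per-child endowment.

def cumulative_transition(y, saving, fertility, gamma=0.5, alpha_max=2.0):
    """Pr(y_{t+1} <= y) for a child whose parent saves `saving` and has `fertility` children."""
    k = saving / fertility            # capital dilution: more children, less capital per child
    alpha_needed = y / k ** gamma     # inverse of f in alpha: the luck that yields exactly y
    return min(max(alpha_needed / alpha_max, 0.0), 1.0)  # uniform CDF at that luck level


def mass_from_cumulative(cutoffs, saving, fertility):
    """Convert the cumulative function into class probabilities, given upper income cutoffs."""
    cdf = [cumulative_transition(c, saving, fertility) for c in cutoffs]
    cdf[-1] = 1.0                     # the top class absorbs all remaining mass
    return [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, len(cdf))]


if __name__ == "__main__":
    cutoffs = [1.0, 2.0, 3.0]         # hypothetical upper bounds of three income classes
    print(mass_from_cumulative(cutoffs, saving=4.0, fertility=1.0))
    print(mass_from_cumulative(cutoffs, saving=4.0, fertility=4.0))
```

With saving held fixed, raising fertility from one to four children shifts probability mass toward the lowest class in this sketch, which is the sense in which a fertility policy can also alter the transition structure.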

In the following section, we show that the 
basic dynamic income transition structure 
characterized in (3) will eventually generate 
a steady state of income distribution, which 
is the main target of our later policy analy- 
sis. Various extensions and complications 
are discussed in Section V. 


II. The Steady State of Income Distribution 


Following the notation often adopted in
the literature, denote the size of the $n$ income
groups in period $t$ by $P_t' = [P_{1,t}, P_{2,t}, \ldots, P_{n,t}]$
and the income-specific
net reproduction rates by a diagonal matrix
$F$. Taking each period as a generation, Lam
(1986a) showed that the evolution of income-specific
population sizes would be
characterized by the following identity:⁷ 


(4) $P_t = M F P_{t-1}$ 


where the mobility matrix $M$ is defined in
Section I. Let the proportion of the population
in income group $i$ at time $t$ be $\pi_{i,t}$.
Dividing both sides of (4) by $N_{t-1} = \sum_{i=1}^{n} P_{i,t-1}$,
the total population size at time
$t-1$, yields 


(5) $M F \pi_{t-1} = M F (P_{t-1}/N_{t-1}) = P_t/N_{t-1} = (P_t/N_t)\, g_t = \pi_t g_t$ 

or explicitly⁸ 

(5') $\sum_j M_{ij} F_j \pi_{j,t-1} = \pi_{i,t}\, g_t$ 


where $g_t = N_t/N_{t-1}$ is the population
growth rate at period $t$, and $\pi_t' = [\pi_{1,t}, \pi_{2,t}, \ldots, \pi_{n,t}]$.
Summing both sides of (5')
over $i$ and using the property $\sum_i M_{ij} = 1 = \sum_i \pi_{i,t}$,
we get 


(6) $g_t = \sum_j F_j \pi_{j,t-1}$. 


In the steady state, (5) will become $M F \pi^* = \pi^* g^*$,
where $\pi^*$ and $g^*$ denote the respective
variables in the steady state. 

Let $T$ be defined as $FM'$; then $T_{ij}$ represents
the expected number of $j$-group children
born to an $i$-group parent. On the
assumption that the offspring of all income
groups have positive probability of joining
other income groups, and assuming that
$F_i > 0$ is finite for all $i$, the matrix $T$ satisfies
the requirements of positive regularity. The
following theorem proved by Charles Mode
(1970 pp. 14, 30; see also Samuel Karlin and
Howard Taylor, 1975 p. 546) is essential to
our later discussion.⁹ 


⁷Equation (4) characterizes a Markov branching
process. It should be noticed that the state variable of
this branching process is not income but, rather, a
point distribution. A typical point distribution at time $t$
is written as $Z_t = (y_1, P_{1,t}; y_2, P_{2,t}; \ldots; y_n, P_{n,t})$, which
means that there are $P_{i,t}$ people with income $y_i$,
$i = 1, \ldots, n$, at time $t$. As $t \to \infty$, the state variable $Z_t$ will
not converge by itself; but the proportion of people in
various income classes will converge. See Theodore
Harris (1963 Ch. II) for detailed explanation. 

⁸To simplify our notations, the summation sign in
this paper always ranges from 1 to $n$ unless otherwise
specified. 


THEOREM 1: If $T$ is positively regular,
then $T$ has a unique positive dominant eigenvalue
$g$. Corresponding to $g$, there are right
and left eigenvectors $\mu' = (\mu_1, \ldots, \mu_n)$ and
$\nu = (\nu_1, \ldots, \nu_n)$, having strictly positive elements
and with the properties $g\nu = \nu T$, $T\mu = g\mu$,
and $\nu\mu = 1$. If $g > 1$, the frequency distribution
$\pi_{i,t}$ given in (5') will converge to
$\pi_i^* = \nu_i/(\nu_1 + \cdots + \nu_n)$ $\forall i$ almost surely. Furthermore,
if we set $T_1 = \mu\nu = (\mu_i \nu_j)$, it follows
that $T^n/g^n \to T_1$. 


Mode’s theorem tells us that as long as $T$ is
positively regular, starting with any arbitrary
vector $\pi_0' = [\pi_{1,0}, \ldots, \pi_{n,0}]$, the iteration of
equation (5) will generate the steady-state
$\pi^*$. This result will be particularly useful for
our analysis in the following sections. It is
also shown by Mode (1970) that if $g \le 1$,
then the population will become extinct with
probability one. Thus, the case $g > 1$ is the
only interesting case to discuss here.¹⁰ In
what follows we study how the steady-state
distribution $\pi^*$ will change as the poor’s
reproduction rate changes. 
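
The iteration of equation (5) described above can be sketched numerically. The three-class mobility matrix and reproduction rates below are hypothetical values chosen only so that every entry is positive; they are not taken from the paper.

```python
# Hypothetical three-class example (illustrative values, not from the paper).
# M[i][j] = probability that a child of a class-j parent joins class i,
# so every column of M sums to one.
M = [[0.6, 0.3, 0.1],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
F = [2.0, 1.5, 1.2]   # diagonal of F: income-specific net reproduction rates


def step(pi, M, F):
    """One generation: returns (pi_t, g_t) from pi_{t-1} using (5') and (6)."""
    n = len(pi)
    g = sum(F[j] * pi[j] for j in range(n))                    # equation (6)
    new_pi = [sum(M[i][j] * F[j] * pi[j] for j in range(n)) / g
              for i in range(n)]                               # equation (5')
    return new_pi, g


def steady_state(M, F, tol=1e-13, max_iter=10000):
    """Iterate from the uniform distribution until pi stops changing."""
    n = len(F)
    pi = [1.0 / n] * n
    g = None
    for _ in range(max_iter):
        new_pi, g = step(pi, M, F)
        if max(abs(a - b) for a, b in zip(new_pi, pi)) < tol:
            return new_pi, g
        pi = new_pi
    return pi, g


if __name__ == "__main__":
    pi_star, g_star = steady_state(M, F)
    print([round(p, 4) for p in pi_star], round(g_star, 4))
```

Starting from any other strictly positive vector should lead to the same limit, which is the practical content of the theorem for this kind of calculation.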


III. Distributional Impact of Changing 
Income-Specific Reproduction Rates 


The question Lam (1986a) addressed but
failed to answer was: what is the sign of
$\partial\pi_1^*/\partial F_1$? With the sign of $\partial\pi_1^*/\partial F_1$ unknown,
it does seem extremely difficult to
analyze the impact of a changing $F_1$ on the
various steady-state inequality measures,
which are usually nonlinear functions of all
elements of the steady-state $\pi^*$. This difficulty
has prompted us to search for other
ways to tackle the problem. In a seminal
paper, Anthony Atkinson (1970) pointed out
that, instead of comparing various inequality
measures with different and specific underlying
concepts of social welfare, it is more
appropriate to apply the notion of stochastic
dominance and compare the distribution
of income directly, with minimum restrictions
on the properties of social welfare
function. This is basically what we intend to
do below. 


⁹The part of Mode’s theorem that is irrelevant to
our later presentation is not repeated here. 

¹⁰See Chu (1988) for detailed discussion about the
meaning of $g > 1$. 

As in Section II, let us, without loss of
generality, order the indexes (subscripts)
of all income groups in such a way that
$y_1 \le y_2 \le \cdots \le y_n$. We shall put forth two
assumptions. 


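ASSUMPTION 1: 
$F_1 \ge F_2 \ge \cdots \ge F_n$. 


ASSUMPTION 2: 
$\frac{\sum_{i=1}^{I} M_{i1}}{\sum_{i=1}^{J} M_{i1}} \ge \frac{\sum_{i=1}^{I} M_{i2}}{\sum_{i=1}^{J} M_{i2}} \ge \cdots \ge \frac{\sum_{i=1}^{I} M_{in}}{\sum_{i=1}^{J} M_{in}}$, $\quad 1 \le I \le J \le n$. 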


The first assumption characterizes a stylized 
fact in developing countries referred to by 
Ahluwalia (1976 p. 326) that “the lower 
income groups typically experienc[e] a faster 
natural rate of increase.” The second as- 
sumption requires that the mobility matrix 
obeys the property of conditional stochastic 
monotonicity (CSM), which is a variant of 
Kalmykov’s (1962) condition of stochastic 
monotonicity (SM): 


(SM) $\sum_{i=1}^{I} M_{i1} \ge \sum_{i=1}^{I} M_{i2} \ge \cdots \ge \sum_{i=1}^{I} M_{in}$, $\quad 1 \le I \le n$. 


Notice that CSM implies SM but not vice
versa (set $J = n$ in Assumption 2 and recall
that each column of $M$ sums to one), and
hence CSM is a stronger assumption.
In our context, Assumption 2 means
that if a poor kid and a rich kid both fall
into the poorest $J$ classes, it is more likely
that the poor kid will be poorer than the
rich kid, which seems to be an intuitively
appealing statement. More discussion about
the comparison between Assumption 2 and
Kalmykov’s SM condition will be left to
Section V. 
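
A minimal sketch of how the two monotonicity conditions could be checked for a given mobility matrix follows. The two matrices are hypothetical examples constructed for illustration (the second is built to satisfy stochastic monotonicity while violating the conditional version); neither comes from the paper. Ratios are compared by cross-multiplication to avoid dividing by cumulative sums.

```python
def cumulative_columns(M):
    """C[I][j] = M[0][j] + ... + M[I][j]: chance that a class-j child lands in the poorest I+1 classes."""
    n = len(M)
    C = [[0.0] * n for _ in range(n)]
    for j in range(n):
        running = 0.0
        for i in range(n):
            running += M[i][j]
            C[i][j] = running
    return C


def is_sm(M, tol=1e-12):
    """Stochastic monotonicity: cumulative column sums fall as parental class rises."""
    C = cumulative_columns(M)
    n = len(M)
    return all(C[I][j] >= C[I][j + 1] - tol
               for I in range(n) for j in range(n - 1))


def is_csm(M, tol=1e-12):
    """Conditional stochastic monotonicity: C[I][j] / C[J][j] falls in j for all I <= J."""
    C = cumulative_columns(M)
    n = len(M)
    for J in range(n):
        for I in range(J + 1):
            for j in range(n - 1):
                # C[I][j]/C[J][j] >= C[I][j+1]/C[J][j+1], written without division
                if C[I][j] * C[J][j + 1] < C[I][j + 1] * C[J][j] - tol:
                    return False
    return True


# Hypothetical matrices; columns are parental classes and sum to one.
M_csm = [[0.6, 0.3, 0.1],
         [0.3, 0.4, 0.3],
         [0.1, 0.3, 0.6]]
M_sm_only = [[0.5, 0.45, 0.1],
             [0.4, 0.05, 0.2],
             [0.1, 0.50, 0.7]]

if __name__ == "__main__":
    print(is_sm(M_csm), is_csm(M_csm))
    print(is_sm(M_sm_only), is_csm(M_sm_only))
```

In the second matrix, a child of the poorest class who ends up in the bottom two classes is less likely to be in the very bottom one than a comparable child of the middle class, which is exactly what the conditional condition rules out.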

Now we shall introduce the main theorem
of this paper. Suppose $M$ and $F$ are the
original mobility and fertility matrices, and
without loss of generality suppose at period
zero the steady state associated with $M$ and
$F$ has been reached with $\pi_0$ the steady-state
income distribution. Let us consider a policy
experiment that increases $F_1$ by $\delta$. After
such an increase in $F_1$, $\pi_0$ obviously will no
longer be the steady state, and a sequence
of distribution vectors $\pi_1, \pi_2, \ldots, \pi_t, \ldots$ will
evolve according to equation (5). The following
theorem can be established: 


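THEOREM 2: If Assumptions 1 and 2 hold,
then 


(7) $\frac{\sum_{i=1}^{I} \pi_{i,t}}{\sum_{i=1}^{J} \pi_{i,t}} \ge \frac{\sum_{i=1}^{I} \pi_{i,0}}{\sum_{i=1}^{J} \pi_{i,0}}$, $\quad 1 \le I \le J \le n$, $\forall t$. 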


The proof is given in the Appendix.
When we set $J = n$ in the denominator of
(7), we have 


(8) $\sum_{i=1}^{I} \pi_{i,t} \ge \sum_{i=1}^{I} \pi_{i,0}$, $\quad 1 \le I \le n$, $\forall t$ 


which is the conventional first-degree
stochastic dominance (FSD) relation which
was found by Jean-Pierre Danthine and
John Donaldson (1981) while comparing
distributions of two Markov (not branching)
processes. Thus our conditional first-degree
stochastic dominance (CFSD) result in (7) is
stronger than the previous unconditional
FSD result. In order to make comparisons
with the existing literature, in what follows
we will concentrate on the implications of
the FSD result in (8). 

When we let $t$ go to infinity in (8), $\pi_t$ on
the left-hand side will converge to a new
steady-state distribution (denoted $\pi_*$), as
asserted in Theorem 1. Thus, we have 


(9) $\sum_{i=1}^{I} \pi_{i,*} \ge \sum_{i=1}^{I} \pi_{i,0}$, $\quad 1 \le I \le n$ 
from which many interesting economic implications
can be derived. First, by setting
$I = 1$ in (9), we obtain 


$\pi_{1,*} \ge \pi_{1,0}$. 


When $\delta$ (the introduced fertility difference
between regimes * and 0) is infinitesimal,
we have the usual comparative static result: 


$\frac{\partial \pi_1}{\partial F_1} = \lim_{\delta \to 0} \frac{\pi_{1,*} - \pi_{1,0}}{(F_1 + \delta) - F_1} \ge 0$. 

This provides an answer to the question
raised by Lam (1986a p. 1109): under what
conditions can we determine the sign of
$\partial\pi_1/\partial F_1$? From the analysis of Chu (1987),
it is clear that the conditions that enable us
to answer Lam’s question must include information
on all the terms of matrices $M$
and $F$. The question then becomes one of
whether the conditions we propose are reasonable
and whether they contain clear economic
interpretations. From the discussions
presented at the beginning of this section,
we believe that Assumptions 1 and 2 satisfy
these two criteria. 

Another similar property that can be derived
from (9) concerns the comparative
statics of the proportion rich in the steady
state. Since $\sum_{i=1}^{n} \pi_{i,*} = 1 = \sum_{i=1}^{n} \pi_{i,0}$, by setting
$I = n - 1$ in (9) we have 


$\pi_{n,*} \le \pi_{n,0}$. 


Thus, reducing the poor’s reproduction rate
will definitely increase the steady-state proportion
rich. 
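
The dominance relations in (8) and (9) can be illustrated numerically. The sketch below assumes a hypothetical three-class mobility matrix and hypothetical reproduction rates (neither is from the paper) that satisfy Assumptions 1 and 2, raises the reproduction rate of the poorest class, and follows the distribution along the transition path.

```python
# Hypothetical inputs (not from the paper). M[i][j] is the probability that a
# child of a class-j parent joins class i; columns sum to one.
M = [[0.6, 0.3, 0.1],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
F_old = [2.0, 1.5, 1.2]   # original reproduction rates
F_new = [2.5, 1.5, 1.2]   # the poorest class's rate raised by delta = 0.5


def step(pi, M, F):
    """One generation of equation (5'), returning the next distribution."""
    n = len(pi)
    g = sum(F[j] * pi[j] for j in range(n))
    return [sum(M[i][j] * F[j] * pi[j] for j in range(n)) / g for i in range(n)]


def steady_state(M, F, iters=5000):
    pi = [1.0 / len(F)] * len(F)
    for _ in range(iters):
        pi = step(pi, M, F)
    return pi


def cumulative(pi):
    out, running = [], 0.0
    for p in pi:
        running += p
        out.append(running)
    return out


def more_mass_at_bottom(pi_new, pi_old, tol=1e-12):
    """True if each cumulative share of pi_new is at least that of pi_old, as in (8) and (9)."""
    return all(a >= b - tol for a, b in zip(cumulative(pi_new), cumulative(pi_old)))


pi_0 = steady_state(M, F_old)        # steady state before the change
pi_star = steady_state(M, F_new)     # steady state after the change

# Transition path starting from the old steady state under the new rates.
path = [pi_0]
for _ in range(30):
    path.append(step(path[-1], M, F_new))

if __name__ == "__main__":
    print([round(p, 4) for p in pi_0])
    print([round(p, 4) for p in pi_star])
    print(all(more_mass_at_bottom(p, pi_0) for p in path[1:]))
```

The check along `path` corresponds to (8) for the first thirty generations, and the comparison of `pi_star` with `pi_0` corresponds to (9).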

Furthermore, the FSD result in (9) also
allows us to analyze the welfare impact of a
changing $F_1$. For the class of Benthamite
social welfare functions 


$W_* = \sum_j U(y_j)\, \pi_{j,*}$ 


with monotonically increasing $U(\cdot)$, it has
long been understood (see Hadar and Russell,
1969 theorem 1) that $\pi_0$ exhibits FSD
to $\pi_*$ implies 


(10) $\sum_j U(y_j)\, \pi_{j,0} = W_0 \ge W_* = \sum_j U(y_j)\, \pi_{j,*}$. 

Therefore, one can conclude that, if Assumptions
1 and 2 are satisfied, then a reduction
in $F_1$ can increase social welfare for
a very large class of social welfare functions. 
As pointed out by Atkinson (1970), the 
above-mentioned direct ranking of income 
distribution, which does not rely on detailed 
functional specifications of social welfare, is 
a better alternative to conventional compar- 
isons of various (perhaps conflicting) in- 
equality measures. 

Finally, if we set $U(X) = X$ $\forall X$ in the
definition of $W_*$, then one particular implication
of (10) will be that the mean of the
steady-state income distribution will also be
increased by a reduction in $F_1$. 
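
The welfare ranking in (10) can be checked on a small example. The incomes and the two distributions below are hypothetical numbers chosen only to satisfy the dominance relation; they are not results from the paper, and the logarithmic utility is just one increasing function among many.

```python
import math


def exhibits_fsd(pi_a, pi_b, tol=1e-12):
    """True if pi_a first-degree stochastically dominates pi_b (classes ordered poor to rich)."""
    cum_a = cum_b = 0.0
    for a, b in zip(pi_a, pi_b):
        cum_a += a
        cum_b += b
        if cum_a > cum_b + tol:   # dominance needs no more mass at the bottom than pi_b
            return False
    return True


def welfare(utility, incomes, pi):
    """Benthamite welfare: the sum over classes of U(y_j) * pi_j."""
    return sum(utility(y) * p for y, p in zip(incomes, pi))


# Hypothetical numbers for illustration only.
incomes = [1.0, 2.0, 4.0]
pi_0 = [0.40, 0.35, 0.25]      # distribution with the lower reproduction rate of the poor
pi_star = [0.45, 0.33, 0.22]   # distribution with the higher reproduction rate of the poor

if __name__ == "__main__":
    print(exhibits_fsd(pi_0, pi_star))
    print(welfare(math.log, incomes, pi_0), welfare(math.log, incomes, pi_star))
    print(welfare(lambda y: y, incomes, pi_0), welfare(lambda y: y, incomes, pi_star))
```

Setting the utility to the identity function returns mean income, so the last comparison is the special case discussed in the text.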

In summary, the above discussion shows 
that the high reproduction rate of the low- 
income group has a negative effect on all 
the distributional measures that can be ex- 
plicitly worked out. These findings are strong 
theoretical support for family-planning pro- 
grams that advocate a lower reproduction 
rate of the poor in developing countries. 


IV. The Impact of Changing Income-Specific 
Reproduction Rate on Steady-State 
Population Growth 


Having analyzed the distributional impact
of a changing $F_1$, we now study how the
steady-state population growth rate will be
affected by changes in $F_1$. For demonstration
purposes, let us designate the reproduction
rate of the poor as $F_1 = F_1^* + \delta$ and let
$\delta$ characterize the change in $F_1$. For any
given $\delta$, the dynamic transition rule in (5')
can be rewritten as 


(11) $g_t(\delta)\, \pi_{i,t}(\delta) = \sum_j T_{ji}(\delta)\, \pi_{j,t-1}(\delta)$ 


where $T_{ji} = F_j M_{ij}$ as demonstrated in Section
II, and where we let the $\delta$’s that follow
all variables remind us that the dynamic
system (11) is affected by the variable $\delta$.
Differentiating (11) with respect to $\delta$ and
evaluating the result at $\delta = 0$ yields 


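(12) $g_t(0)\, \frac{d\pi_{i,t}(0)}{d\delta} = -\pi_{i,t}(0)\, \frac{dg_t(0)}{d\delta} + \frac{T_{1i}(0)}{F_1}\, \pi_{1,t-1}(0) + \sum_j T_{ji}(0)\, \frac{d\pi_{j,t-1}(0)}{d\delta}$. 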

Equation (12) characterizes the comparative
dynamics of an infinitesimal change of $F_1$ at
the point $F_1^*$. Suppose that at period zero
$\delta = 0$ and that the steady state corresponding
to $\delta = 0$ (or $F_1 = F_1^*$) has been achieved;
then, by the definition of the steady state,
we have $\pi_{i,t}(0) = \pi_i^*$ $\forall t$ and $g_t(0) = g^*$ $\forall t$.
Thus, equation (12) can be written as 


(13) $g^*\, \frac{d\pi_{i,t}}{dF_1} = -\pi_i^*\, \frac{dg_t}{dF_1} + \frac{T_{1i}}{F_1}\, \pi_1^* + \sum_j T_{ji}\, \frac{d\pi_{j,t-1}}{dF_1}$ 

where we drop the “(0)” in each term to
simplify our later presentation. One can
iteratively lag (13) one period and substitute
the lagged result in the last term on the
right-hand side of (13) to obtain¹¹ 


(14) [equation not recoverable from the scanned source] 


¹¹Technical detail is available from the authors upon 
request. 
where $T_{ij}^{t+1} = \sum_k T_{ik}^{t} T_{kj}$. The almost-sure
convergence property of the dynamic system
characterized in Theorem 1 tells us that $g_t$
and $\pi_t$ will eventually converge to a new
steady state, and therefore the changes in $g_t$
and $\pi_t$ will also converge to constants:
$dg_t/dF_1 \to dg^*/dF_1$ and $d\pi_{i,t}/dF_1 \to d\pi_i^*/dF_1$.
Furthermore, Theorem 1 tells us
that as $t \to \infty$, $T_{ij}^{t} \to (g^*)^t \mu_i \nu_j$, which implies
that the last two terms of (14) cancel each
other out. With the above information, (14)
can be further simplified as 


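(15) $0 = -\pi_i^*\, \frac{dg^*}{dF_1} + \frac{g^*}{F_1}\, \pi_1^* \mu_1 \nu_i$. 


Because $\pi_i^* = \nu_i/\sum_j \nu_j$ $\forall i$ by Theorem 1,
(15) can be rearranged to 


(16) $\left(\frac{dg^*}{dF_1}\right)\left(\frac{F_1}{g^*}\right) = \mu_1 \nu_1$. 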


It is worth noting from the above derivation 
that the validity of (16) is independent of 
Assumptions 1 and 2 and the source of the 
income-specific fertility change; hence, what 
we actually prove is the following. 


THEOREM 3: 
$(dg^*/dF_i)\cdot(F_i/g^*) = \mu_i \nu_i > 0$, $\quad i = 1, \ldots, n$. 


Theorem 3 is an easy formula that enables
us to predict the steady-state change of
population growth as a result of changes in
the reproduction rate of any income group.
Furthermore, since $\sum_i \mu_i \nu_i = 1$ by Theorem
1, it is clear that $\mu_i \nu_i < 1$ for $i = 1, \ldots, n$.
Thus, Theorem 3 signifies that a reduction
in $F_i$ will entail a corresponding decrease in
the steady-state population growth rate, but
the elasticity of the change is less than one. 
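
The elasticity formula in Theorem 3 can be checked numerically. The sketch below uses a hypothetical three-class mobility matrix and hypothetical reproduction rates (not from the paper), obtains the dominant eigenvalue and both eigenvectors of T by plain power iteration, and compares the product of eigenvector elements with a central finite-difference estimate of the elasticity.

```python
# Hypothetical inputs (not from the paper). M[i][j] is the probability that a
# child of a class-j parent joins class i; columns sum to one.
M = [[0.6, 0.3, 0.1],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
F = [2.0, 1.5, 1.2]


def dominant(A, iters=3000):
    """Power iteration: dominant eigenvalue g and positive eigenvector x (summing to one), A x = g x."""
    n = len(A)
    x = [1.0 / n] * n
    g = 0.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        g = sum(y)                     # x sums to one, so sum(A x) estimates g
        x = [v / g for v in y]
    return g, x


def build_T(M, F):
    """T = F M', so T[i][j] = F[i] * M[j][i]."""
    n = len(F)
    return [[F[i] * M[j][i] for j in range(n)] for i in range(n)]


def elasticities(M, F):
    """Return g and the vector of mu_i * nu_i, normalised so that nu . mu = 1."""
    T = build_T(M, F)
    n = len(F)
    g, mu = dominant(T)                                  # right eigenvector: T mu = g mu
    T_transposed = [[T[j][i] for j in range(n)] for i in range(n)]
    _, nu = dominant(T_transposed)                       # left eigenvector: nu T = g nu
    scale = sum(m * v for m, v in zip(mu, nu))
    return g, [m * v / scale for m, v in zip(mu, nu)]


def growth(M, F):
    return dominant(build_T(M, F))[0]


def finite_difference_elasticity(M, F, i, h=1e-5):
    """Central-difference estimate of (dg/dF_i) * (F_i / g)."""
    up = list(F)
    up[i] += h
    down = list(F)
    down[i] -= h
    return (growth(M, up) - growth(M, down)) / (2 * h) * F[i] / growth(M, F)


if __name__ == "__main__":
    g_star, e = elasticities(M, F)
    print(round(g_star, 4), [round(v, 4) for v in e])
    print([round(finite_difference_elasticity(M, F, i), 4) for i in range(3)])
```

Because the products sum to one across income groups, each elasticity lies strictly between zero and one, in line with the remark following the theorem.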


V. Discussion and Extensions 

A. Comparative Dynamics 

What is emphasized in Section III is that
a decrease in $F_1$ will make the steady-state
income distribution exhibit CFSD to the
original distribution. However, this in itself
is not sufficient for us to argue for the
soundness of the policy of reducing $F_1$
without knowing how income distributions
would change in transition periods. The answer
to the latter is provided in (7), in which
we see that as $F_1$ decreases, $\pi_t$ would always
exhibit CFSD to $\pi_0$ in all transition
periods until the steady state is reached.
This completes the comparative dynamics of
our analysis and further strengthens our
confidence in family-planning programs for
the poor. 


B. Family Size an Intergenerational Mobility 


In Lam (1986a) as well as in our discussions
in Section III, it is assumed that a
reduction in F will not give rise to any
changes in the terms of the mobility matrix.
However, many studies have also found a
persistently negative relationship between
family size and child achievements in general.¹²
In particular, it has been argued
that, with a high $F_1$, a poor family would
have less to spend per capita on human
capital investment in children, which would
negatively affect the upward mobility of
these children. Thus, as $F_1$ changes, we also
expect changes in the first column of M,
which embodies the information on the upward
mobility of the poor.

Denote $(F^A, M^A)$ and $(F^B, M^B)$ respectively
as the F and M matrices before and
after the change of $F_1$. The change from
$(F^A, M^A)$ to $(F^B, M^B)$ can be explored in
two different stages: (i) $(F^A, M^A)$ to
$(F^B, M^A)$ and (ii) $(F^B, M^A)$ to $(F^B, M^B)$. The
first stage, which involves a changing F but
with the mobility matrix unchanged, has already
been analyzed in Section III, and now
it is the second-stage change that we examine.
Given the definite effect of changes in
$F_1$ on the first column of M, denoted $M_1$, it
is pertinent to ask how such an effect is going
to take place. Let $M_1' = [M_{11}, M_{21}, \ldots, M_{n1}]$.
Suppose that an increase in $F_1$ will make
$M_1$ change to $\tilde{M}_1$. We assume that the $\tilde{M}_1$
vector satisfies the following.


¹²For a general survey of studies on this topic, see
Elizabeth M. King (1986).

ASSUMPTION 3:

$$\frac{\sum_{i=1}^{I} \tilde{M}_{i1}}{\sum_{j=1}^{J} \tilde{M}_{j1}} \ge \frac{\sum_{i=1}^{I} M_{i1}}{\sum_{j=1}^{J} M_{j1}}, \qquad 1 \le I \le J \le n.$$

This means that, as the reproduction rate of
the poor increases, each child of the poor
will have a higher conditional probability of
becoming poorer. More explicitly, given that
a poor child will fall into the poorest J classes
both before and after the change in $M_1$
(caused by an increase in $F_1$), Assumption 3
says that he is more likely to be poorer after
$F_1$ increases. Assumption 3 is also intuitively
appealing.


THEOREM 4: If Assumptions 1 and 2 hold
and if the increase of $F_1$ worsens the upward
mobility of the poor in such a way that Assumption
3 is satisfied, then inequality (7) will
hold.


PROOF:

We have shown in the proof of Theorem
2 that, under Assumptions 1 and 2, a policy
change from $(F^A, M^A)$ to $(F^B, M^B)$ will make
the transmission of the CFSD property hold
in all transition periods. Thus, all that is
needed to establish Theorem 4 is to show
that the change from $(F^B, M^A)$ to $(F^B, M^B)$
will provoke a CFSD change in income distribution
in the first period (see Case 1 of the
Appendix), and this is easy to establish under
Assumption 3.


C. Immigration and Emigration 


In the discussion of Section III, we did
not consider the possible effects of migration.
Poor parents in rural areas may want
to send their children to urban areas, where
better job opportunities are expected. Rich
parents in developing countries may want to
send their children to developed countries
to escape the unstable political and economic
environment of their homeland. This
kind of income-specific emigration has two
different impacts on the dynamic income-transition
structure of the outflowing country:
First, if x percent of the i-class children
were to migrate out when they grow up, the
net reproduction rate of the ith class would
become $\tilde{F}_i = F_i(1 - x)$. Second, sending out
some children may ease the capital-dilution
effect within a family and hence change
the ith class’s mobility vector from $M_i$ to,
say, $\tilde{M}_i$. Similar analysis can be applied to
the new dynamic process generated by $\tilde{F}$
and $\tilde{M}$.

The case of immigration is more difficult
to analyze because the behavior and pattern
of immigrants are exogenous to our model.
Suppose $x_t' = [x_{1t}, \ldots, x_{nt}]$ is the vector of
total population that migrates to a society;
then, the equation of motion in (4) will
become $P_t = MFP_{t-1} + x_t$. Once the relationship
between $x_t$ and $P_{t-1}$ is specified,
this augmented dynamic structure can also
be studied in a similar fashion.¹³
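The two migration adjustments described above can be illustrated with a minimal sketch. It borrows the five-class mobility matrix from the numerical example in Section VI; the 10 percent emigration rate for the richest class, the starting population, and the immigrant inflow are hypothetical values chosen only for illustration, and the helper names `step` and `growth_rate` are not from the paper.

```python
# Mobility matrix from the paper's numerical example (Section VI).
# Column j holds the destination probabilities for children of class j.
M = [
    [0.45, 0.25, 0.10, 0.05, 0.00],
    [0.25, 0.40, 0.20, 0.15, 0.10],
    [0.15, 0.20, 0.35, 0.25, 0.20],
    [0.10, 0.10, 0.25, 0.35, 0.30],
    [0.05, 0.05, 0.10, 0.20, 0.40],
]


def step(P, F, x=None):
    """One period of P_t = M F P_{t-1} + x_t (x=None means no immigration)."""
    n = len(P)
    births = [F[j] * P[j] for j in range(n)]
    nxt = [sum(M[i][j] * births[j] for j in range(n)) for i in range(n)]
    if x is not None:
        nxt = [a + b for a, b in zip(nxt, x)]
    return nxt


def growth_rate(F, periods=500):
    """Long-run growth factor of the process without immigration."""
    P = [1.0 / len(F)] * len(F)
    g = 1.0
    for _ in range(periods):
        nxt = step(P, F)
        g = sum(nxt) / sum(P)
        total = sum(nxt)
        P = [p / total for p in nxt]   # renormalize to population shares
    return g


# Emigration: a hypothetical 10 percent of class-5 children leave,
# so the net reproduction rate of that class becomes F_5 * (1 - x).
F = [1.0, 1.0, 1.0, 1.0, 1.0]
out_rate = 0.10
F_emig = F[:]
F_emig[4] = F[4] * (1.0 - out_rate)
g_before = growth_rate(F)
g_after = growth_rate(F_emig)

# Immigration: a hypothetical inflow of 5 people into the poorest class.
P0 = [100.0] * 5
x = [5.0, 0.0, 0.0, 0.0, 0.0]
P1 = step(P0, F, x)

print(g_before, g_after)
print(sum(P0), sum(P1))
```

The sketch leaves the mobility matrix unchanged under emigration, so it covers only the first of the two impacts named in the text; a changed mobility vector would require replacing the relevant column of `M`.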


D. The Age Structure 


The overlapping-generation structure
proposed in Sections II and III condenses a
person’s life into two periods, which is
clearly a simplification of human age structure.
A more complicated age structure can
be embodied in our basic framework by
applying the multidimensional life table. Let
us define an augmented transition matrix
M, where $M_{ij,a}$ describes the transition
from income class i at age a to income class
j at age a + 1. Similarly, we can also include
both age and income as arguments in the
fertility function. Lam (1986a) provided a
more detailed discussion of this extension,
and the ergodicity result of this augmented
model was also shown to hold.

One advantage of bringing in the age
structure is that the possible interactions
between family members’ age structure and
family incomes will have a chance to appear
in the model.¹⁴ The cost of including both


¹³Ergodic results in an age-specific branching process
with immigration have been analyzed by Thomas
Espenshade et al. (1982).

¹⁴For example, it is Chayanov’s belief that there is a
correlation between family size and farm size (and
hence family income) and that this changes with the
life cycle of the peasant family. Chayanov argues that
this kind of life-cycle interaction is one of the important
factors that affect the evolution of Russian peasant
families. See, for example, David Grigg (1983) for
a more detailed discussion.

age and income as state variables is that the
transition rules among various states will
become very complex, and it does not seem
likely that we can find plausible and simple
restrictions on fertility and mobility functions
from which interesting comparative
dynamics about the steady states can be
derived. The overlapping-generation structure
may be too highly simplified to guide
our analysis, but it appears to be an appropriate
benchmark for the issue we are interested
in.


E. Analysis of the Impact of 
Alternative Policies 


Although our analysis in Section III was
confined to the impact of changes in $F_1$, the
same can also apply to changes in $F_i$ for
$i \ne 1$ and other policy experiments. From
the Appendix, we see that as long as Assumptions
1 and 2 are true, the transmission
of the CFSD property will hold from period
one onward. Thus, for policy experiments
applied to an old steady state at period
zero, it is quite possible to establish a
CFSD comparison of steady states as long
as the policy experiment in question can
make the first-period distribution of income
($\pi_1$) exhibit CFSD to the original steady
state ($\pi_0$). Since all that is needed for such
a comparison is the information at period
zero, in the case of well-specified policy
experiments, such information should not
be hard to come by.


F. Comparisons Between Assumption 2 
and Kalmykov’s SM Condition 


As Carl Futia (1982) pointed out, com- 
parative statics about how the invariant dis- 
tributions of a Markov process change when 
the transition probability function is altered 
are very difficult in general, and the only 
known results in related research require 
the SM assumption of Kalmykov (1962) and 
Daley (1968). Since the model discussed in 
this paper is a Markov branching process 
which is more general and more complex 
than Markov processes, it seems unreason- 
able to start analysis with any assumptions 
weaker than the SM condition. Notice that 
the SM condition places stochastic dominance
(SD) restrictions on columns of M,
whereas our Assumption 2 places condi- 
tional stochastic dominance (CSD) restric- 
tions on columns of M. We summarize the 
conclusions obtained so far in the first two 
rows of Table 1. 

Two points should be mentioned here.
First, we have been given an example in
which both Assumption 1 and SM are satisfied
but no comparative dynamic results can
be obtained.¹⁵ This makes us believe that
the SM condition, which works for Markov
processes in Daley’s study, is insufficient to
derive similar results for Markov branching
processes. However, it is easy to show that,
if one is willing to replace Assumption 1
with a stronger version as in the third row
of Table 1, then the comparative dynamic
results can still be derived.¹⁶ Second, although
Assumption 2 is intuitively appealing,
as we argued in Section III, it indeed
requires more restrictions on entries of the
mobility matrix than the SM condition.¹⁷ It
would be interesting to investigate whether
any results can be obtained with assumptions
different from the SD type, or whether
a comparative static (instead of comparative
dynamic) result can be obtained with the
usual SM condition. These seem to be
promising directions for future research.


VI. A Numerical Example 


As a last part of our illustration, a numer- 
ical example is used to work out various 
comparative statics. In the process, we also 
introduce a method that can efficiently gen- 
erate various steady-state results. Let us set 


$$M = \begin{bmatrix}
0.45 & 0.25 & 0.10 & 0.05 & 0.00 \\
0.25 & 0.40 & 0.20 & 0.15 & 0.10 \\
0.15 & 0.20 & 0.35 & 0.25 & 0.20 \\
0.10 & 0.10 & 0.25 & 0.35 & 0.30 \\
0.05 & 0.05 & 0.10 & 0.20 & 0.40
\end{bmatrix}$$


¹⁵The example was provided by Christophe Lefranc.
¹⁶Details are provided in Chu (1989), which is available
upon request.
¹⁷For instance, it is easy to verify that the example
in Lam (1986a p. 1112) satisfies SM but not
Assumption 2.
TABLE 1—ASSUMPTIONS AND CONCLUSIONS DERIVED

Source                    Assumptions                              Results
Kalmykov (1962)           1) $F_1 = F_2 = \cdots = F_n$            SD relation established
                          2) SD on columns of M                    for comparative dynamics
Theorem 2 of this paper   1) $F_1 \ge F_2 \ge \cdots \ge F_n$      CSD relation established
                          2) CSD on columns of M                   for comparative dynamics
Chu (1989)                1) $F_1 \ge F_2 = F_3 = \cdots = F_n$    SD relation established
                          2) SD on columns of M                    for comparative dynamics

TABLE 2—STEADY STATES OF THE NUMERICAL EXAMPLE

                               Case A                                 Case B
                               ($F_i = 1.00$, $i = 1, \ldots, 5$)     ($F_1 = 1.01$, $F_i = 1.00$, $i = 2, \ldots, 5$)
$\mu'$                         [0.447, 0.447, 0.447, 0.447, 0.447]    [0.453, 0.447, 0.446, 0.445, 0.445]
$\pi^{*\prime}$                [0.167, 0.228, 0.238, 0.220, 0.146]    [0.168, 0.229, 0.238, 0.220, 0.146]
$g^*$                          1.000                                  1.002
$K = \pi^{*\prime}\mu$         0.447                                  0.447
$v' = \pi^{*\prime}/K$         [0.374, 0.511, 0.533, 0.492, 0.327]    [0.376, 0.511, 0.533, 0.491, 0.326]
$\mu_1 v_1$                    0.167                                  0.170
$\Delta g F_1/(\Delta F_1 g^*)$  0.168                                0.169


and the steady states will be calculated for
two cases: (A) $F_i = 1$, $i = 1, \ldots, 5$ and (B)
$F_1 = 1.01$ and $F_i = 1$, $i = 2, \ldots, 5$; that is, case
(B) expands the poor’s reproduction rate by
one percent. Given the above information,
one can iterate (5') and (6) to generate the
steady state $(g^{A*}, \pi^{A*})$ and $(g^{B*}, \pi^{B*})$ for
each case. Furthermore, for any column
vector not orthogonal to $\pi^*$, a pair of sequences
is defined as follows:


(17a) $X_t = M y_{t-1}$, $t \ge 1$


(17b) $y_t = (X_t' X_t)^{-1/2} X_t$, $t \ge 0$.


It has been shown that the sequence $y_t$ will
converge to $\pm\mu$, where $\mu$ is the right
eigenvector corresponding to $g^*$.¹⁸ Finally,
we calculate $\pi^{*\prime}\mu = K$ and set $v = \pi^*/K$.
Since v so obtained will satisfy the normalization
condition $v'\mu = 1$, the value $v_i \mu_i$ can
be used to calculate the elasticity presented
in Theorem 3. The results are summarized
in Table 2.
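The computation summarized above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: it uses plain power iteration on the matrix MF from the equation of motion $P_t = MFP_{t-1}$, takes the income shares as the right eigenvector of MF, and treats $\mu$ as the corresponding left eigenvector scaled so that $\sum_i \mu_i^2 = 1$ (a reading inferred from the values reported in Table 2). The function names are hypothetical.

```python
# Mobility matrix from the numerical example; column j holds the
# destination probabilities for children born into class j.
M = [
    [0.45, 0.25, 0.10, 0.05, 0.00],
    [0.25, 0.40, 0.20, 0.15, 0.10],
    [0.15, 0.20, 0.35, 0.25, 0.20],
    [0.10, 0.10, 0.25, 0.35, 0.30],
    [0.05, 0.05, 0.10, 0.20, 0.40],
]


def transition(M, F):
    """Return the matrix MF, i.e. column j of M scaled by F[j]."""
    return [[M[i][j] * F[j] for j in range(len(F))] for i in range(len(M))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def dominant_pair(A, iters=2000):
    """Power iteration: dominant eigenvalue and eigenvector (entries sum to 1)."""
    x = [1.0 / len(A)] * len(A)
    g = 1.0
    for _ in range(iters):
        y = [sum(a * b for a, b in zip(row, x)) for row in A]
        g = sum(y)                 # x sums to one, so the total is the growth factor
        x = [v / g for v in y]
    return g, x


def steady_state(F):
    """Return g*, the income shares pi*, and the product mu_1 * v_1."""
    A = transition(M, F)
    g, pi = dominant_pair(A)                 # right eigenvector: income shares
    _, left = dominant_pair(transpose(A))    # left eigenvector of MF
    norm = sum(u * u for u in left) ** 0.5
    mu = [u / norm for u in left]            # scaled so that sum(mu_i^2) = 1
    K = sum(p * u for p, u in zip(pi, mu))
    v = [p / K for p in pi]                  # then sum(v_i * mu_i) = 1
    return g, pi, mu[0] * v[0]


g_a, pi_a, elasticity_a = steady_state([1.00, 1.00, 1.00, 1.00, 1.00])
g_b, pi_b, elasticity_b = steady_state([1.01, 1.00, 1.00, 1.00, 1.00])

# Finite-difference elasticity of g* with respect to F_1, evaluated at case A.
finite_difference = (g_b - g_a) / 0.01 * (1.00 / g_a)

print(round(g_a, 3), round(g_b, 3))
print([round(p, 3) for p in pi_a])
print(round(elasticity_a, 3), round(finite_difference, 3), round(elasticity_b, 3))
```

Under these assumptions the growth rates, income shares, and elasticities should agree with Table 2 up to rounding, and the finite-difference elasticity should lie close to the value $\mu_1 v_1$ given by Theorem 3.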


¹⁸See Mode (1970 p. 104) for details.


From (17b) and the property that $y_t \to
\pm\mu$, it is clear that the $\mu$ vector has
been normalized with $\sum_i \mu_i^2 = 1$, which together
with the condition $v'\mu = 1$ in Theorem
1 uniquely determines the steady-state
$(\mu, v)$ pair. From Table 2 one can verify two
things: (i) $\pi^{A*}$ exhibits CFSD to $\pi^{B*}$, and
(ii) $\mu_1^A v_1^A < (\Delta g/\Delta F_1)(F_1^A/g^{A*}) <
(\Delta g/\Delta F_1)(F_1^B/g^{B*}) < \mu_1^B v_1^B$. It can be
checked that, when the experimental change
in $F_1$ gets smaller, the formula in Theorem
3 will provide a more accurate prediction
of the change in population growth.


VII. Conclusions


This paper examines the comparative dy- 
namic relationship between income distri- 
bution and the reproductive rate of the 
low-income group. The problem is an im- 
portant one, because many economists have 
argued that income inequalities in develop- 
ing countries are caused by high population 
growth and that high population growth 
rates in most developing areas are due to 
the high reproductive rate of the poor. Our 
main finding is that, under some fairly reasonable
assumptions, a reduction in the reproductive
rate of the poor will cause a
conditional stochastic dominance improvement
in income distribution in the steady
state as well as in all the transition periods.
This result provides us with very strong theoretical
support in favor of family-planning
programs that encourage the poor in developing
countries to reduce their reproductive
rate. Furthermore, we derive an easy formula
for calculating the elasticity of the
steady-state population growth rate with
respect to a change in the poor’s fertility rate.
Technically, the analysis in this paper is an
extension of Kalmykov’s work on Markov
processes to multitype Markov branching
processes.


APPENDIX 


The following lemma is needed to prove 
(7). 


LEMMA: 


pPA+(1-p)E qC+(1-q)E 


Al) —— > m 
l ) BrO- p)F qD +(1—-4)F 


a 
—_—>— > — . 
"BOD F 


d A2C, 
3 an 


where A, B,...,F >0, and 1> p2q20. 


The proof of the above lemma is straightforward
algebra and hence will not be presented
here. Technical details are available
from the authors for interested readers.

Now we will proceed to prove Theorem 2. 


PROOF OF THEOREM 2: 
Case 1, when t =1: 


i 
Eisi 

J . 
j=j 





Wy o( Fy, + SEZ Mat 
mi olf, + 8)Dj. Mat 


+ oF dja M 
J 
at Ty oF nei 1 Mj on 


I hed 
11 oF Lijae Mi + 
© oF Lye Mat + 


T 
+ Tn OF dai = 1Min g 
J 
Tn ofni = 1M; 
by Assumption 2


I 
= d= Tio 


J * 
Lie 17,0 


Case 2, when t > 1: Assuming 


Svar, yor, 
bel i, t—1 i=1"7,0 
i a 
. j=1Fj,t-1 j=17}j,0 
we want to show 
I I 
ae 1734 Lja 177 59 
wo. zp T Iss 
D Lrj o 
or equivalently, to show 
yoa Llr, 
i=1" i,t i=1“ i0 
(A3) I<n. 


Sila. t - Par 0 
The left-hand side of (A3) is 
(A4) 


Ti 1(Fi + 89D jy Mi + 
Ti, = CF, +8 ML Mae 


e ETa Fn Eim Min 
EG Ensi TIM, 


A constructive proof of (A3) is given below. 
First, consider only the first two terms 
in (A4): 
Similarly, the first two terms of the right-hand
side of (A3) could be written as


(119+ 72,9) FC 


A6) Be OEE L ie 
( (Tio t 729) FD 


where 
From Assumptions 1 and 2, the Lemma,
and (A2), it is clear that


(A7a) 


which gives (A5) > (A6), and 
Now, we will proceed to the case with the
first three terms in (A3). The first three
terms in the left-hand side of (A3) could be
rewritten as
where
Similarly, the first three terms in the right-hand
side of (A3) could be rewritten as


(tro + T20 + T30) FC 
AD e 
(Tiot 729+ 73,9) FD 


where 
c=] Tio ETag ea 
Tio + T29+ 73 9 F, 
Again, from (A2), Assumptions 1 and 2, and
(A7), we have


A C' 
BE? D' 
which gives (A8) > (A9), and 


(A10a) 
Expressions (A10) will facilitate the next
step with four terms.
Productivity, Health, and Inequality in the Intrahousehold
Distribution of Food in Low-Income Countries


By MARK M. PITT, MARK R. ROSENZWEIG, AND MD. NAZMUL HASSAN*


A model is formulated incorporating linkages among nutrition, labor-market 
productivity, health heterogeneity, and the intrahousehold distribution of food 
and work activities in a subsistence economy. Empirical results, based on a 
sample of households from Bangladesh, indicate that, despite considerable intra- 
household disparities in calorie consumption, households are averse to inequality. 
Furthermore, consistent with the model, the results also indicate that both the 
higher level and greater variance in the calories consumed by men relative to 
women reflect in part the greater participation by men in activities in which 
productivity is sensitive to health status. (JEL 824, 122, 850). 


A prominent if not distinguishing feature 
of low-income countries that has been in- 
corporated into many models of behavior in 
such settings is the proximity of average 
income levels to subsistence. Models of sav- 
ings behavior (Mark Gersovitz, 1983) and 
wage determination (Harvey Leibenstein, 
1957; Joseph Stiglitz, 1976; Partha Dasgupta 
and Debraj Ray, 1984), for example, have 
demonstrated the possibility that behavior 
at low income levels may be quite distinct 
from that observed when income levels are 
well above those required for survival. 
Low-income societies are also characterized
by an occupational distribution in which activities
requiring high levels of energy expenditure
predominate, and a number of
recent studies have shown that health and
food consumption directly affect productivity
and wage rates in low-income environments
(John Strauss, 1986; Anil Deolalikar,
1988; Jere Behrman and Deolalikar, 1989).
In a subsistence regime, the allocation of
food is thus particularly important, and the
measurement of the overall level of inequality
in low-income countries must take into
account how households in such environments
distribute food among their individual
members.

One salient aspect of the distribution of 
food in low-income settings that has caught 
the attention of many social scientists is the 
disparity in nutrients received by women 
compared to men, particularly in South and 
West Asian societies.¹ One hypothesis that
has been advanced is that gender-based nu- 
trient inequality reflects disparities in 
labor-market opportunities between men 
and women in these settings, with the pecu- 
niary returns to a household from the allo- 
cation of food to women being less than 
those for men. Indeed, some empirical stud- 
ies have shown the existence of a relation- 
ship between sex differences in infant mor- 
tality rates and differences in labor-market 
participation rates between men and women 
(Pranab Bardhan, 1974; Rosenzweig and T. 
Paul Schultz, 1982). However, there is little 
direct evidence of a relationship between 
the actual intrahousehold distribution of 
food across individuals and labor-market ac- 
tivities; nor is there a clear theoretical link- 
age established between labor-market char- 
acteristics and patterns of intrahousehold 


¹For an extensive review of the literature concerned
with gender inequality and the intrahousehold distribution
of food, see Behrman (1990).
TABLE 1—HOUSEHOLD DISTRIBUTIONS OF CALORIES BY AGE AND SEX IN BANGLADESH

                                  Age < 6                     6 < Age < 12                   Age > 12
Statistic                  Males  Females  χ²(d.f.)     Males  Females  χ²(d.f.)     Males  Females  χ²(d.f.)
Mean household               891     751   2.35 (217)   1,549   1,536   0.25 (220)   2,672   2,063   609.1 (465)
  calorie consumption
Mean household              43.6    41.1   0.26 (38)     11.1    10.5   0.23 (29)     11.5    7.05   4.48 (143)
  coefficient of variation

Source: Nutrition Survey of Bangladesh, 1981-2.


food allocation that may arise in low-income
environments.²

Although attention has mainly focused on
gender inequality in food allocation, if the
relationship between healthiness and productivity
differs across occupations and activities,
the distribution of activities across
individuals within gender classes should also
be related to the intrahousehold distribution
of foods. Table 1 presents the means of
average household calorie consumption and
the intrahousehold coefficient of variation
in calorie consumption by age and sex for a
probability sample of 345 households from
15 villages in Bangladesh.³ These figures
show that, while there is a large (30 percent)
and statistically significant difference
in the average number of calories allocated
to men and women aged 12 and above,
there is no difference between sexes in mean
calories consumed by children ages 7
through 11. For children aged 6 and below,
boys on average receive more calories than
girls, however. Gender differences in aver- 


²One important study of the intrahousehold distribution
of nutrients (Behrman, 1988) finds no apparent
link between expected labor-market opportunities and
sex disparities in nutrient consumption. However, this
study only considers the allocation of foods among
children less than 13 years of age, a large proportion of
whom do not participate in the labor market and,
perhaps more importantly, among whom there may be
little differentiation with respect to work activities. For
this group, the link between the labor market and food
consumption can only be indirect and is in any case not
explicitly modeled.

³Later in this paper, we describe the characteristics
of this data set. Calorie consumption in Bangladesh is
a good indicator of overall nutrient consumption, given
the simplicity of the Bangladeshi diet, as discussed in
Section III. The sex- and age-specific coefficients of
variation are computed only for those households with
two or more individuals in each group.


age calorie consumption are thus highly 
age-dependent. Table 1 also shows, more 
interestingly, that mean within-household 
inequality in food consumption, measured 
by the coefficient of variation, is 64 percent 
higher among males aged 12 and over than 
among females of the same age. Among 
children less than 12 years of age, however, 
inequality in calorie consumption among 
boys and girls is similar.⁴
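The two adult gender gaps quoted from Table 1 can be checked with simple arithmetic. The sketch below uses only the rounded Table 1 entries for the oldest age group; the variable and function names are illustrative.

```python
# Rounded Table 1 entries for the oldest age group.
adult_calories = {"males": 2672.0, "females": 2063.0}
adult_cv = {"males": 11.5, "females": 7.05}


def percent_gap(male, female):
    """Percentage by which the male figure exceeds the female figure."""
    return 100.0 * (male / female - 1.0)


calorie_gap = percent_gap(adult_calories["males"], adult_calories["females"])
cv_gap = percent_gap(adult_cv["males"], adult_cv["females"])

print(f"mean calories: {calorie_gap:.1f} percent higher for males")
print(f"coefficient of variation: {cv_gap:.1f} percent higher for males")
```

With the rounded table entries, the gap in mean calories comes out at roughly 30 percent and the gap in the coefficient of variation at roughly 63 percent, close to the 64 percent quoted in the text; the small difference presumably reflects rounding in the published table.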

Table 2 displays the distribution of activi- 
ties ranked by their energy requirements, 
within the same sex and age groups. These 
figures demonstrate that stratification by ac- 
tivities also varies by age and sex and in 
large part parallels what is observed in Table 
1 for calorie consumption. The similarity in 
energy intensity and diversity of activities 
exhibited by girls and boys in the below-six 
and 6-to-12 age groups mirrors the similarity
in the mean and variability in calorie con- 
sumption among boys and girls in those age 
groups exhibited in Table 1. Furthermore, 
the large disparities in participation rates in 
high-energy-intensive activities between men 
and women aged 12 and over are consistent 
with the gender differences in the variability 
of calorie consumption depicted in Table 1 
for that age group. 

Tables 1 and 2 are suggestive of a direct 
linkage between the type of work activities 


⁴Similar patterns characterize the Indian village data
used by Behrman (1988). He shows that there are no 
sex differences in average nutrient allocations for chil- 
dren younger than 13, the subset of the population he 
studies. However, using the same data set, we find that 
for individuals aged 13 and above, mean calorie con- 
sumption is 12 percent higher for males. The variance 
in consumption among males is 15.6 percent higher 
than it is among females in that age group. Both of 
these differences are statistically significant. 
TABLE 2—PERCENTAGE ACTIVITY DISTRIBUTION BY ENERGY REQUIREMENTS,
AGE, AND SEX

                          Age < 6          6 < Age < 12         Age > 12
Energy requirement     Males  Females     Males  Females     Males  Females
Insignificant           98.7    99.3       70.5    69.1       26.8    20.6
Light                    1.3     0.7       28.8    25.6       22.6     8.5
Moderate                 0       0          0       4.5       2.82    68.2
Very high                0       0          0.7     0.8       31.9     1.2
Exceptionally high       0       0          0       0         13.9     1.5

Sample size (N)         133     129        140     133        433     473

χ² (d.f.)                 0.28 (1)           6.87 (3)          625.4 (4)

P                         0.600              0.076             0.0001


P: Significance level (probability).
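The significance levels attached to the chi-square statistics for the two younger age groups can be reproduced from closed-form tail probabilities. The sketch below is only a check under the assumption that the reported statistics are ordinary chi-square variates with 1 and 3 degrees of freedom; `chi2_sf` is a hypothetical helper that handles just those two cases.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, for df = 1 or df = 3 only."""
    tail = math.erfc(math.sqrt(x / 2.0))      # exact expression for df = 1
    if df == 1:
        return tail
    if df == 3:
        # Closed-form correction term that takes df = 1 to df = 3.
        return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    raise ValueError("only df = 1 and df = 3 are handled in this sketch")


p_under_6 = chi2_sf(0.28, 1)     # statistic for the youngest group
p_6_to_12 = chi2_sf(6.87, 3)     # statistic for the middle group

print(round(p_under_6, 3), round(p_6_to_12, 3))
```

The first value is close to 0.60 and the second close to 0.076, so neither of the two younger groups shows a gender difference in activity mix at the 5 percent level, in line with the discussion in the text.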


and the intrahousehold distribution of food 
consumption, but they of course do not ex- 
plain the diversity in activities within gender 
groups. In this paper, we examine the rela- 
tionship between the household distribution 
of foods and labor-market activities in the 
context of a model incorporating (i) linkages 
among food consumption, health, and la- 
bor-market productivity and (ii) individual 
heterogeneity in inherent or “endowed” 
healthiness. The model takes as given dif- 
ferences in the opportunities for work activ- 
ities by gender; conditional on the cir- 
cumscribed activities of women, it yields 
implications for how the distribution of in- 
dividual health endowments and nutrition- 
productivity linkages influence the distribu- 
tion of food and energy expenditure (effort) 
across individual members of a household 
and provides a method for measuring gen- 
der-based discrimination by the household. 
Section I presents the model. Section II 
discusses the methodology used to compute 
individual endowments, and Section III reports
estimates, based on a sample of
households from 15 villages in Bangladesh, 
of the effects of food consumption and ac- 
tivities on weight-for-height, the effects of 
health endowments on calorie consumption 
by sex and age, and the effects of endow- 
ments on activity choice and income. 

The empirical results appear to be consis- 
tent with the hypothesis that work activity 
distributions substantially influence the in- 
trahousehold distribution of food. In partic- 
ular, the greater participation by men in 


energy-intensive activities in which health 
status may importantly influence productiv- 
ity is in part responsible for both the higher 
level of calories consumed by adult men and 
the greater variance in calories consumed 
among men compared to women. We are 
able to infer from our estimates, however, 
that households are averse to inequality in 
health outcomes, with men bearing slightly 
more of the “cost” of equalization than 
women as a consequence of their participa- 
tion in activities requiring high energy 
levels. 


I. Theory 


To analyze the relationships among the 
distribution of food, health, and labor- 
market activities, we set out a framework 
describing the allocation of food and the 
choice of labor-market “effort” across het- 
erogeneous individuals residing in inte- 
grated household units, defined by common 
objective functions. For simplicity we as- 
sume that there is only one food or nutri- 
ent. The model can be readily extended to 
incorporate multiple foods and nutrients 
with no alteration in its basic implications. 
The health status $h_i^k$ of an individual i
among a class of individuals k is assumed to
be influenced by food consumption $c_i$ and
by effort $e_i$ expended in some work activity.
In general, the effects of these variables on
health may be nonmonotonic. However, we
assume that in a subsistence economy food
augments health, while effort decreases
health (stamina) such that


(1) $h_i^k = h^k(c_i, e_i, \mu_i)$, $\quad \partial h^k/\partial c_i > 0$, $\quad \partial h^k/\partial e_i < 0$,


where $\mu_i$ is the endowed health of an individual,
that component of health influenced
by neither consumption nor effort.

Effort is rewarded in the labor market,
with the returns to effort increasing with
health status. The wage rate $w_i^k$ for an
individual i in the class of individuals k is
given by the following:⁵


(2) $w_i^k = w^k(e_i, h_i)$, $\quad \partial w^k/\partial e_i,\ \partial w^k/\partial h_i,\ \partial^2 w^k/\partial e_i \partial h_i > 0$.


Individuals are assigned to classes (age, sex)
by the characteristics of the health and effort
wage functions so that every member of
each class has the same h and w functions;
individuals are individually differentiated by
their health endowments $\mu_i$, which are
known to all family members.
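To see how a quadratic wage function can satisfy the restrictions attached to (2), the relevant derivatives can be written out. The sketch below assumes the quadratic example takes the form $w = h(e + \gamma e^2)$, with the class and individual indices suppressed; this functional form is one reading of the printed example, and $\gamma$ is a free parameter.

```latex
\begin{align*}
w &= h\,(e + \gamma e^{2}), \\
\frac{\partial w}{\partial h} &= e + \gamma e^{2},
  \quad \text{which equals } 0 \text{ at } e = 0, \\
\frac{\partial^{2} w}{\partial h^{2}} &= 0, \\
\frac{\partial^{2} w}{\partial e\,\partial h} &= 1 + 2\gamma e > 0
  \quad \text{whenever } 1 + 2\gamma e > 0 .
\end{align*}
```

Under this form, health has no marginal product when effort is zero, the wage is linear in health, and health raises the return to effort over the range where $1 + 2\gamma e$ is positive.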

Expressions (1) and (2) capture the essential
assumption of the nutrition-wage literature:
that food consumption augments labor-market
productivity, presumably via
health status. However, while the nutrition-based
efficiency wage literature assumes a
purely technological relationship between
effort and health (or food consumption),
here both food consumption and labor effort
are choice variables.⁶ Moreover, we allow
the wage function to differ across classes
of individuals, which may result from their 


⁵We also assume that the marginal product of health
vanishes if there is no effort, so that the second derivative
of health in the wage function is zero. If (2) is
quadratic, for example, then $w = h(e + \gamma e^2)$.

⁶We assume that work time is fixed (and set to
unity) as is conventionally assumed in the nutrient-wage
literature. It is possible to include home production
activities in total work time, with (2) being replaced by
a goods-production function, with no alteration in the
basic implications of the model. Our data set contains
no information on the amount of time allocated to any
activity.
allocation to particular sets of activities.
Thus, for example, in India, few women
engage in plowing; in Bangladesh, no women
are observed pulling rickshaws. The relationship
between health and the returns to
effort is likely to be quite different in
those activities in which both women and
men participate. Indeed, Behrman and
Deolalikar (1989) and David Sahn and
Harold Alderman (1988), based on data
from India and Sri Lanka, respectively,
found that health (measured as weight-for-height)
and calorie consumption had significant
positive effects on the wage rates of
men but not women.

The allocations of food and work effort
across individuals in a household unit are
determined from the solution to the maximization
problem


(3) $\max\ U = U(h_i^k, c_i^k, e_i^k)$, $\quad i = 1, \ldots, n_k$; $\quad k = 1, \ldots, m$


subject to

(4) $v + \sum_k \sum_i w_i^k - p \sum_k \sum_i c_i^k = 0$


and equations (1) and (2), where v =
nonearned income and p is the price of the
food good. In the household welfare function
(3), it is assumed that increases in both
health status and food consumption augment
utility, while increases in work effort
lower utility.

The necessary first-order conditions for
the allocation or assignment of food and
work effort to individual i of class k are


(5) $\frac{\partial U}{\partial h_i^k}\frac{\partial h^k}{\partial c_i} + \frac{\partial U}{\partial c_i^k} = \lambda\left[p - \frac{\partial w^k}{\partial h_i}\frac{\partial h^k}{\partial c_i}\right]$


(6) $\frac{\partial U}{\partial h_i^k}\frac{\partial h^k}{\partial e_i} + \frac{\partial U}{\partial e_i^k} = -\lambda\left[\frac{\partial w^k}{\partial e_i} + \frac{\partial w^k}{\partial h_i}\frac{\partial h^k}{\partial e_i}\right]$


where $\lambda$ = marginal utility of income. Condition
(5) states that the marginal cost of
allocating an additional unit of food to person
i is lower the greater the extent to
which health augments work efficiency.
Thus, if the members of class l participate
in activities for which the market returns to
health are greater compared to those activities
in which members of class k participate
(who are otherwise identical), then on average
class-l individuals will receive higher
allocations of food than will class-k individuals.
Since we assume for simplicity that
work (whether market or nonmarket) time
is the same for all individuals (there are few
idle women in low-income countries), it is
not market work time (or even the average
wage rate) that matters for food allocation
(as in Rosenzweig and Schultz [1982]), but
the type of activity engaged in, as defined by
the wage-effort-health association.

Within a class, the distribution of food 
and work effort across individuals will de- 
pend on the distribution of endowments. To 
highlight the roles of both health in the 
labor market and household preferences in 
influencing these distributions, assume that 
the endowment is additive in (1). Thus, dif- 
ferences in endowments do not influence 
the health returns to food consumption. 
Consider first a model, nested in (3), in 
which household income is maximized. The 
maximand is the left-hand side of expres- 
sion (4), and the necessary first-order condi- 
tions are given by (5) and (6) with the left- 
hand side of each expression replaced by 
zero. In the income-maximizing model, the 
relationships between the endowment of an 
individual i in class k and that person’s
allocation of food and work effort are given 
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where 


r aw*\ { a7h* 
| ah, |\ ac,dc; 


awk ð aw \{ank\” 
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dee, \ deh, || de; 




















a*w* \ f an* aw*\ { a*h* 
pam o + m 

de,0h, || Oc; dh, | \ 0e,0c; 
awk faw*\{ a*n* r 
<| Seah, Von, deac)? 


As indicated by equation (8), under the 
income-maximization regime those individu- 
als with greater endowments of health 
supply more effort, because health aug- 
ments the labor-market returns to effort 
($\partial^2 w^k / \partial e_i \, \partial h_i > 0$). More-endowed individuals
also receive more food because food
increases health, which increases the re- 
turns to effort, and because effort depletes 
health status; increased food consumption 
both compensates for and enhances the re- 
turn from increased effort. Thus, those indi- 
viduals exerting greater effort or in effort- 
intensive activities will also be consuming 
more food. Moreover, those classes of indi- 
viduals in activities for which the returns to 
work effort are more sensitive to health 
status will be characterized by greater dif- 
ferences in food consumption (and effort) 
compared to an otherwise identical class of 
individuals with the same distribution of 
endowments. This is because the magnitude 
of the (positive) endowment—food-con- 
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sumption slope in (7) depends positively on 
the degree to which health augments mar- 
ket returns to effort. 
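
The comparative statics of the income-maximizing model can be illustrated numerically. The sketch below is not the authors' specification: the functional forms (health `h = alpha*log(c) - delta*e + mu`, wage earnings `e*(theta0 + theta*h)`) and every parameter value are assumptions chosen only so that an interior optimum exists. Under those assumptions, a brute-force grid search shows food and effort both rising with the endowment, and the endowment-food slope steepening when health augments the return to effort more strongly (larger `theta`).

```python
import numpy as np

def income_max_allocation(mu, theta, theta0=10.0, alpha=1.0, delta=1.0, p=1.0):
    """Grid-search the food (c) and effort (e) that maximize net income.

    All functional forms and parameter values are illustrative assumptions.
    """
    c = np.arange(0.5, 40.0, 0.02)[:, None]      # candidate food levels
    e = np.arange(0.0, 8.0, 0.01)[None, :]       # candidate effort levels
    h = alpha * np.log(c) - delta * e + mu       # health, additive endowment mu
    income = e * (theta0 + theta * h) - p * c    # wage earnings minus food cost
    i, j = np.unravel_index(np.argmax(income), income.shape)
    return float(c[i, 0]), float(e[0, j])

# Low versus high sensitivity of the return to effort to health.
c_lo0, e_lo0 = income_max_allocation(mu=0.0, theta=4.0)
c_lo1, e_lo1 = income_max_allocation(mu=0.5, theta=4.0)
c_hi0, e_hi0 = income_max_allocation(mu=0.0, theta=6.0)
c_hi1, e_hi1 = income_max_allocation(mu=0.5, theta=6.0)

slope_lo = (c_lo1 - c_lo0) / 0.5   # endowment-food slope, theta = 4
slope_hi = (c_hi1 - c_hi0) / 0.5   # endowment-food slope, theta = 6
print(f"theta=4: food {c_lo0:.2f} -> {c_lo1:.2f}, effort {e_lo0:.2f} -> {e_lo1:.2f}")
print(f"endowment-food slope: {slope_lo:.2f} (theta=4) vs {slope_hi:.2f} (theta=6)")
```

A grid search is used instead of a numerical optimizer so that the example needs only NumPy; the grid bounds were picked by hand to contain the optimum for these parameter values.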

In the utility-maximization model (3), the 
relationships among own endowments, food 
consumption, and work effort are given by 


; dck awk \ fah \ | ank\ 
) aa O a || Ge, | II Ge, 





a7 y* 
(Seo) Seam} 


dek |aw* {dw* 
(10) —= =| — +| —— 
H i 


g 
a 
| 


| a a°w* def pa 
(Sae) sean, | * o | an, 


where $dc^k/dv$ and $de^k/dv$ are income effects
on food and effort, $S_{cc}$ and $S_{ee}$ are
the Hicks-Slutsky compensated own substitution
effects (negative and positive, respectively),
and $S_{ce}$ is the Hicks-Slutsky cross-compensated
substitution effect, which is
negative if effort (a “bad”) and food
consumption are substitutes. The first of the
three right-hand-side terms in (9) and (10) 
arises from the welfare function in (3). This 
term indicates that the relationships among 
own endowments, food consumption, and 
effort depend on the relative magnitudes of 
substitution and income effects. If income 
effects are small, then in the absence of 
labor-market returns, higher-endowed indi- 
viduals receive less food and provide more 
labor-market effort. Some of their higher 
health is thus taxed away via both the food 
and effort allocations; low-endowment indi- 
viduals are “compensated” for their low 
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endowments by higher food and lower effort 
allocations. 

The last two right-hand-side terms in (9) 
and (10) arise because of the health-effort 
interaction in the labor market. Both of 
these terms are positive in the food-allo- 
cation equation (9), given that food is a 
normal good. Thus, the association between
own endowments and food consumption will
be algebraically higher the more strongly
health augments the returns to effort. If
women are barred (or refrain) from participating
in activities in which health status
strongly affects productivity, then compen- 
sation (reinforcement) with respect to food 
is more (less) likely than among men. 

We note that defining compensation with
respect to the sign of the relationship between
own endowments and an individual-specific
input, such as foods, can be misleading
when more than one allocated good
affects health status and welfare. An alternative
method of gauging compensation, and
of more meaningfully assessing the differential
treatment of different classes of individuals
by the household, is to examine the
net change in health status associated with
a change in endowment. This is given by
(11) in the additive endowment case:

(11) $\frac{dh_i^k}{d\mu_i^k} = 1 + \frac{\partial h^k}{\partial c_i}\frac{dc_i^k}{d\mu_i^k} + \frac{\partial h^k}{\partial e_i}\frac{de_i^k}{d\mu_i^k}$

If the sum of the last two terms in (11) is
negative (positive), then compensation (reinforcement)
with respect to health occurs
for group k; reinforcement with respect to 
foods is thus not inconsistent with a house- 
hold’s aversion to inequality in health sta- 
tus. Expression (11) may differ across 
groups; intergroup differences in (11) thus 
are a measure of net discrimination across 
groups by the household with respect to 
health that incorporates both food and ef- 
fort allocations. 

While it is clear that the signs of the own 
endowment effects on food and effort do 
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not necessarily distinguish between the in- 
come- and welfare-maximizing models, the 
existence of cross endowment effects can 
only arise when household welfare is being 
maximized (and the welfare function is not 
linear in its arguments). It is straightforward 
to show that, although such effects cannot 
be signed in general, the cross effect of j’s 
endowment on i’s food consumption is more
negative (given that the consumption of i 
and j are substitutes in the household wel- 
fare function) the stronger is the relation- 
ship between health and effort productivity 
for j.7 Thus, the cross effect of a woman’s
endowment on a man’s food consumption, 
given the gender differences in activities 
exhibited in many South Asian societies, 
will be algebraically greater than will the 
effect of a change in a man’s endowment on 
the woman’s food allocation, while own en- 
dowment effects will be algebraically greater 
for males. Knowledge about both the health
technology and the role of health in aug- 
menting productivity is thus critical for un- 
derstanding the determinants of the alloca- 
tion of foods and effort levels. 


II. Estimating the Relationships between 
Endowments and Household 
Resource Allocations 


To estimate the association among the 
endowments of members of a household, 
their food consumption, and their expendi- 


7The effect of a change in the endowment of individual j in
class l on the food consumption of individual i in class
k in the welfare-maximization model is given by

The cross effect of j’s endowment on i’s food consumption
is more negative (given that $c_i$ and $c_j$ are
substitutes in the household welfare function) the
stronger is the relationship between health and effort
productivity for j.
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ture of effort, we employ a method first 
used in Rosenzweig and Schultz (1983), in 
which the health technology (1) is estimated 
directly, and based on the technology pa- 
rameter estimates and the actual resources 
consumed or expended by each individual, 
individual-specific endowments are com- 
puted. There are two problems with this 
“residual” endowment method. First, if en- 
dowments, which are not directly observed 
by the researcher, influence resource alloca- 
tions, consistent estimates of the household 
production technology cannot be obtained 
using least squares; that is, $c_i$, $e_i$, and the
unobserved $\mu_i$ will be correlated in (1). One
method of identifying the technology is to 
use instruments. In this case, food prices, 
labor-market variables reflecting labor de- 
mand, and exogenous components of in- 
come determine resource allocations but do 
not directly affect health status, given food 
and activity levels. 
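
The identification argument can be illustrated with simulated data. This is a minimal sketch, not the paper's estimation: the one-input linear technology, the single price instrument, and all numeric values are assumptions. Calories respond to the unobserved endowment, so least squares is inconsistent, while an instrument that shifts calories but does not enter the health equation recovers the technology parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
gamma = 0.5                      # assumed true effect of calories on health

price = rng.normal(size=n)       # instrument: moves calories, excluded from health
mu = rng.normal(size=n)          # unobserved endowment
# Assumed allocation rule: calories fall with price and with the endowment.
calories = 1.0 - 0.8 * price - 0.6 * mu + rng.normal(size=n)
health = gamma * calories + mu   # technology with an additive endowment

def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

X = np.column_stack([np.ones(n), calories])
Z = np.column_stack([np.ones(n), price])

b_ols = ols(health, X)[1]
# Two-stage least squares: project X on the instruments, then regress on the fit.
X_hat = Z @ ols(X, Z)
b_2sls = ols(health, X_hat)[1]

# "Residual" endowment computed from the consistent technology estimate.
mu_hat = health - b_2sls * calories

print(f"true effect {gamma:.2f}, OLS {b_ols:.3f}, 2SLS {b_2sls:.3f}")
```

With these assumed values least squares understates the calorie effect because the endowment and calories are negatively correlated; reversing the sign of the assumed allocation response would reverse the direction of the bias.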

A second, less well-recognized problem 
with extracting estimates of endowments 
from estimates of the technology, which 
arises even when the technology is estimated
consistently, is that the derived endowments
will be measured with systematic
error. Contrary to the assertions in Rosenzweig
and Schultz (1983), the measurement
errors in estimated endowments are not 
likely to be random, because the technology 
inputs, in this case individual-specific levels 
of nutrients, are unlikely to be measured 
without error. Endowment effects estimated 
by least squares are thus unlikely to be 
consistent, and the biases cannot necessarily 
be signed a priori. 

To see the measurement-error problem
and one solution, assume for simplicity that
the health-production function contains only
one nutrient (calories). The true (measured
without error) endowment $\mu_i^*$ is thus

(12) $\mu_i^* = H_i^* - C_i^* \Gamma, \qquad i = 1, \ldots, n$

where $H_i^*$ and $C_i^*$ are the (unobserved)
true values of health and calorie consumption,
respectively, and $\Gamma$ is the calorie effect
on health. Assume that the observed values
$H_i$ and $C_i$ have measurement errors $u_i$ and
$e_i$ with classical errors-in-variables properties;
that is,

(13) $H_i = H_i^* + u_i$

(14) $C_i = C_i^* + e_i$

where $E(H_i^* u_i) = 0$, $E(C_i^* e_i) = 0$, and
$E(u_i e_i) = 0$. Therefore, the estimated endowment
$\hat{\mu}_i$ is

(15) $\hat{\mu}_i = (H_i^* + u_i) - (C_i^* + e_i)\hat{\Gamma} = \mu_i^* + u_i - e_i\hat{\Gamma}$

where $\hat{\Gamma}$ is the two-stage least-squares estimate
of the calorie effect, and the endowment
measurement error is $v_i = u_i - e_i\hat{\Gamma}$.

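If there are no other observables or unobservables,
the estimated (linear) calorie allocation
equation is

(16) $C_i = b\hat{\mu}_i + \varepsilon_i$

The least-squares estimator of b, if the true
health endowments are nonstochastic and
$(1/n)\sum_i \mu_i^{*2}$ converges as $n \to \infty$ to a positive
finite limit $\sigma_\mu^2$, is

(17) $\operatorname{plim} \hat{b} = \frac{b\,\sigma_\mu^2}{\sigma_\mu^2 + \sigma_v^2} + \frac{\sigma_{ev}}{\sigma_\mu^2 + \sigma_v^2}$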


The first term in (17) corresponds to the 
classical error-in-variables bias. However, 
the second term appears because of the 
indirect estimation of the endowment from 
(12). When calorie consumption has a positive
marginal product in the production
of health ($\Gamma > 0$), then from (15), the covariance
between $e_i$ and $v_i$ is negative ($\sigma_{ev} < 0$).
Thus, if b is positive, $\hat{b}$ will underestimate b
unambiguously, but if the true endowment 
effect is negative, the sign of the bias is 
indeterminate. The (biased) least-squares 
estimator of the error-ridden endowment 
effect would tend to reject reinforcement 
with respect to calories if it in fact were 
true; errors in measurement in calories will 
make households appear to be more compensatory
with respect to calories than they
really are.8
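
The two sources of bias in (17) can be checked by simulation. The sketch below is illustrative only: it treats the calorie effect as known (standing in for a consistent 2SLS estimate), draws the true endowments from a unit-variance normal distribution, and uses made-up error variances. In this assumed design households truly reinforce (b > 0), yet least squares on the error-ridden endowment yields a negative coefficient.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
b, gamma = 0.3, 0.5              # assumed true endowment effect and calorie effect
sigma_u, sigma_e = 0.5, 1.0      # assumed s.d. of health and calorie measurement errors

mu_true = rng.normal(size=n)     # true endowments, variance sigma_mu^2 = 1
C_true = b * mu_true             # true calorie allocation
H_true = mu_true + gamma * C_true

u = rng.normal(scale=sigma_u, size=n)
e = rng.normal(scale=sigma_e, size=n)
H_obs, C_obs = H_true + u, C_true + e

# Residual endowment as in (15): mu_hat = mu_true + u - e * gamma.
mu_hat = H_obs - gamma * C_obs

# Least squares of observed calories on the estimated endowment, as in (16).
b_ols = (mu_hat @ C_obs) / (mu_hat @ mu_hat)

# Probability limit: attenuation term plus the induced covariance term.
sigma_v2 = sigma_u**2 + gamma**2 * sigma_e**2   # var(v), with v = u - e * gamma
sigma_ev = -gamma * sigma_e**2                  # cov(e, v), negative when gamma > 0
b_plim = (b * 1.0 + sigma_ev) / (1.0 + sigma_v2)

print(f"true b {b:.2f}, OLS {b_ols:.3f}, predicted plim {b_plim:.3f}")
```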

Consistent parameter estimates in the 
presence of errors in variables can be ob- 
tained by using instrumental-variables 
methods. Repeated observations on the 
measured-with-error variable or the avail- 
ability of different but related indicators of 
the phenomena to be measured (for exam- 
ple, multiple-proxy variables) are potential 
sources of instruments. If individual-specific 
food intake and anthropometric measures 
of health were measured at more than one 
point in time, then even if all measurements 
of (calorie) consumption are made with er- 
ror, all that is required for consistent instru- 
mental-variables estimation is that the pe- 
riod-specific errors be uncorrelated across 
time periods. Moreover, if there are avail- 
able multiple indicators of health and thus 
of endowments in addition to repeated 
measures, they can be used as well, as long 
as the measurement errors in each endow- 
ment type are uncorrelated across time pe- 
riods (noncontemporaneously) with the 
health-endowment measurement error. 
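
The repeated-measures idea can be sketched as follows. This is a simulation under assumed values, not the authors' code: each person's endowment is fixed across four rounds, measurement errors are drawn independently in every round, and the instrument for a round's estimated endowment is the person's mean estimated endowment over the other rounds. The helper name `leave_one_out_mean` is hypothetical.

```python
import numpy as np

def leave_one_out_mean(x):
    """Row-wise mean over all rounds except the current one.

    x has shape (n_persons, n_rounds); the result has the same shape.
    """
    n_rounds = x.shape[1]
    return (x.sum(axis=1, keepdims=True) - x) / (n_rounds - 1)

rng = np.random.default_rng(2)
n, T = 50_000, 4
b, gamma = 0.3, 0.5                      # assumed endowment and calorie effects

mu_true = rng.normal(size=(n, 1))        # person-specific endowment, fixed over rounds
C_obs = b * mu_true + rng.normal(scale=1.0, size=(n, T))              # calories + error
H_obs = (1 + gamma * b) * mu_true + rng.normal(scale=0.5, size=(n, T))  # health + error

mu_hat = H_obs - gamma * C_obs           # error-ridden endowment, round by round
Z = leave_one_out_mean(mu_hat)           # instrument built from the other rounds

b_ols = (mu_hat * C_obs).sum() / (mu_hat ** 2).sum()
b_iv = (Z * C_obs).sum() / (Z * mu_hat).sum()

print(f"true b {b:.2f}, OLS {b_ols:.3f}, IV {b_iv:.3f}")
```

The instrument works here only because the simulated errors are independent across rounds, which is the condition stated in the text; serially correlated measurement errors would break it.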


III. Empirical Results


A. Data and the Estimation of the 
Health Technology 


As the previous discussion has made clear, 
to obtain direct estimates of endowment 
effects requires data that not only provide 
individual-specific information on health 
and consumption but also contain (i) sufficient
cross-sectional variation in exogenous vari- 
ables needed as instruments for estimation 
of the health technology and (ii) repeated 
observations on individuals to purge esti- 
mated endowments of measurement errors. 
The 1981-2 Nutrition Survey of Rural 
Bangladesh (Kamaluddin Ahmad and Has- 
san, 1986) provides information on individ- 
ual-specific food consumption and anthro- 


8The parameters associated with all other regressors
measured without error are also biased; the sign of 
their bias can be determined from the variance-covari- 
ance matrix of the observations (Maurice Levi, 1973). 
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pometric measures of health along with 
other individual and household attributes 
for 385 households in 15 villages scattered 
throughout Bangladesh.9 Intrahousehold
food-consumption information was col- 
lected once for 25 (out of 50 sampled) 
households in each of 12 randomly selected 
villages and in 35 (out of 70 sampled) 
households in an industrial town. In addi- 
tion, the same information was collected at 
four separate times within a year for 25 (out 
of 50 sampled) households in each of two of 
the remaining villages. These Bangladesh 
data thus permit estimation of the health 
technology from the cross-sectional sample, 
as well as estimation of endowment re- 
sponses purged of measurement error, based 
on the longitudinal component of the data 
set. 

The intrahousehold dietary information 
was collected by specially trained female 
dietary investigators who measured dietary 
intake by weighing each individual’s intake 
in the home over a 24-hour period. All 
individuals covered by the dietary survey 
were also examined by a clinician, who ob- 
tained measures of weight, height, skinfold 
thickness, and mid-arm circumference. In- 
formation was also obtained on the occupa- 
tion of each household member, and the 
energy intensity of his or her activity was 
coded using guidelines established by the 
Food and Agriculture Organization and the 
World Health Organization (see Appendix 
A). The prices of a wide variety of foods 
sold in the village market were separately 
obtained in the survey, so that there is one 
price per commodity per village. 

We use the information on weight-for- 
height to measure health, which is consid- 
ered a good short-run measure of nutri- 


9The data from one additional village of hill tribes
(who are not racially or ethnically related to Bengalis) 
were not used in our analysis, as their dietary and 
other behaviors are considered too unlike those of 
ethnic Bengalis. 

10Hassan (1984) has compared the nutritional infor-
mation in the survey with that collected in prior nutri- 
tion surveys in Bangladesh (and East Pakistan) to draw 
inferences concerning trends in Bangladeshi health 
and food consumption. 
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tional status that will be sensitive to daily 
food consumption and activity levels. To 
estimate the health technology (1), food 
consumption was converted into nutrient 
intakes using conversion factors specific to 
Bangladeshi foods (Institute of Nutrition 
and Food Science, 1980). Calorie consump- 
tion, however, would appear to be a suffi- 
cient indicator of nutritional intake. The 
typical Bangladeshi diet is very simple; cere- 
als account for 87 percent of calorie con- 
sumption, as well as 78, 82, 84, 70, and 82 
percent of the consumption of protein, iron, 
thiamine, riboflavin, and niacin, respec- 
tively. As the consumption of each nutrient 
is a linear function of all foods consumed, 
the large share of consumption derived from 
just one food group makes the set of ob- 
served nutrient intakes nearly perfectly 
collinear. Moreover, we would expect that 
weight-for-height, as an indicator of short- 
run health, should not respond substantially 
if at all to the intake of any nutrient except 
calories. Daily changes in weight reflect the 
difference between calories consumed and 
calories expended. 

To reflect calorie outflow, we add to indi- 
vidual-specific nutrient consumption in the 
weight-for-height production function two 
dummy variables reflecting participation in 
occupations categorized as “very active” or 
“exceptionally active” based on the occupa- 
tional data.11 We add as well dummy variables
indicating whether a woman was pregnant
or lactating at the time of the survey.
Exogenous regressors included are age, age 
squared, sex, the interaction of sex and age, 
and a set of dummy variables indicating the 
source of the household’s drinking water 
(well, pond, tube well, or river/canal). In 


11A possibly superior procedure would have been to
employ individual dummy variables for each of the 14 
occupations provided in the data. Because we treat 
occupation as a choice variable, however, we would 
need more instruments than we have available to iden- 
tify all of the individual activity effects on weight-for- 
height. The FAO/WHO categories provide a parsimo- 
nious way of representing occupations in terms of their 
consequences for short-term health. If the categoriza- 
tion is correct, our estimates are more efficient than 
those that would be obtained from the more agnostic 
specification, if it could be estimated. 
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accord with the model, we treat nutrients, 
activities, pregnancy, and lactation as en- 
dogenous variables and estimate the pro- 
duction function using two-stage least 
squares. Identifying instruments are the
household head’s age and schooling level,
household landholdings, and the village food
prices interacted with household landholdings,
the head’s schooling and age, and the
individual age and sex variables. The food
prices are those for rice, wheat flour, potatoes,
leafy vegetables, okra, green chilies,
sugar and sweets, eggs, mustard oil, pulses,
fish, milk, onions, garlic, and meat.12

Table 3 presents both (inconsistent) ordi- 
nary least squares (OLS) and consistent 
two-stage least squares (2SLS) estimates 
of the parameters of the Cobb-Douglas pro- 
duction function for weight-for-height. 
These estimates are obtained using the
cross-sectional component of the data de- 
scribing the full set of 15 villages (with one 
round from each of the two multiple-round 
villages). The calorie elasticity is seriously
underestimated by OLS, although it is posi- 
tive and statistically significant using either 
procedure. Moreover, the OLS estimates of 
the effect of the energy intensity of effort on 
weight-for-height are of the opposite sign to 
the consistent 2SLS estimates, indicating a 
possible strong relationship between activity 
choice and the health residual, containing 
the endowment. The 2SLS estimates indi- 
cate that increased calorie consumption sig- 
nificantly increases weight-for-height and 
demonstrate that participation in exception- 
ally active occupations tends to deplete 
weight-for-height, although the estimated
activity coefficient has a relatively large
standard error. The less active occupations
categorized as “very active” have an esti- 
mated coefficient only one-eighth that of 
“exceptionally active” occupations. 

We also tested whether calorie consumption
was a sufficient statistic for nutrient
consumption and whether the calorie elasticity
differed between males and females.
We could not reject the null hypothesis that 


12 Pitt (1983) shows that nutrient consumption is 
significantly responsive to food prices in Bangladesh. 


DECEMBER 1990 


TABLE 3—EFFECTS OF CALORIE CONSUMPTION, ACTIVITY LEVEL, AND PREGNANCY STATUS ON WEIGHT-FOR-HEIGHT

                                        Ordinary          Two-stage
                                        least-squares     least-squares
Variable^a                              estimates^c       estimates^c

Calorie consumption^b                    0.0295            0.136
                                        (4.09)            (3.37)
Very active occupation^b                 0.0859           -0.0119
                                        (5.34)            (0.23)
Exceptionally active occupation^b        0.0668           -0.0817
                                        (3.43)            (1.26)
Pregnant^b                               0.262             0.326
                                        (7.69)            (1.34)
Lactating^b                              0.144             0.513
                                        (9.28)            (4.65)
Age                                      0.284             0.0987
                                        (16.6)            (1.90)
Age squared                             -0.00456           0.0174
                                        (1.44)            (2.37)
Sex (male = 1)                           0.00196          -0.0578
                                        (0.08)            (1.81)
Age x sex                                0.0152            0.0687
                                        (.74)             (4.04)
Water drawn from tube well              -0.0478           -0.0406
                                        (3.13)            (2.10)
Water drawn from well                   -0.0720           -0.0693
                                        (4.11)            (3.15)
Water drawn from pond                   -0.0460           -0.0649
                                        (2.30)            (2.55)
Constant                                -2.56             -3.12
                                        (52.4)            (13.9)
N                                        1,737             1,737
R^2                                      0.775             -
F                                        395.1             -
H0: No influence of calcium,             -                 1.23
  carotene, thiamine, and
  riboflavin consumption^b (F)
H0: No difference in effect of           -                 2.16
  calorie consumption by sex (F)

^a All variables in logs, except sex, water sources, and
activity level.

^b Endogenous variable; instruments include household
head’s age and schooling level, landholdings, and
prices of all foods consumed interacted with individual
age and sex variables, land, and head’s schooling and
age.

^c Asymptotic t ratios in parentheses.


four additional nutrients found by James
Ryan et al. (1984) to be potentially important
determinants of short-run health in a
rural area of India (calcium, carotene, thiamine,
and riboflavin) do not influence
weight-for-height in our sample ($F_{4,1724}$ =
1.23). The null hypothesis that the calorie-
output elasticity is the same for men and
women also could not be rejected ($F_{1,1724}$
= 2.16). Thus, the difference between sexes
in the production of weight-for-height is 
sufficiently well specified as an age-depen- 
dent intercept shift. Except for the first 2.5 
years of life, Bangladeshi males are pre- 
dicted to have greater weight-for-height than 
females having identical levels of inputs. 
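
The crossover age implied by the age-dependent intercept shift can be checked with a short calculation. The sketch assumes the 2SLS coefficients as printed in Table 3 (sex: -0.0578; age x sex: 0.0687), that age enters in natural logs, and that age is measured in years; none of these units are confirmed beyond the table notes. The male advantage is then `sex_coef + age_sex_coef * ln(age)`, which changes sign where the two terms cancel.

```python
import math

# 2SLS coefficients as printed in Table 3 (rounded values).
sex_coef = -0.0578       # intercept shift for males
age_sex_coef = 0.0687    # coefficient on ln(age) x sex

def male_advantage(age_years):
    """Predicted male-female gap in log weight-for-height at equal inputs."""
    return sex_coef + age_sex_coef * math.log(age_years)

# Age at which the gap is zero: ln(age) = -sex_coef / age_sex_coef.
crossover_age = math.exp(-sex_coef / age_sex_coef)

print(f"crossover age: {crossover_age:.2f} years")
print(f"gap at age 1: {male_advantage(1.0):+.4f}, at age 10: {male_advantage(10.0):+.4f}")
```

With the rounded coefficients the crossover falls a little above two years, in the neighborhood of the 2.5 years quoted in the text; the small gap may reflect rounding in the printed estimates.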


B. Endowments and 
Calorie Consumption 


Having obtained estimates of the health 
technology, we can compute health (weight- 
for-height) endowments for each individual 
based on actual calorie consumption and 
activity. In order to use the repeated-mea- 
sure methodology to mitigate the effects of 
errors in measurement, we use the longitu- 
dinal component of the sample, which pro- 
vides four rounds of data for 50 households 
in two of the villages (Gorbaria and Falshattia).
We also use as instruments estimated
endowments of mid-arm circumference and 
skinfold thickness derived from production 
functions, estimated by two-stage least 
squares, containing the same regressors and 
instruments as the weight-for-height pro- 
duction function. Three instruments for an 
individual’s weight-for-height endowment in 
a period $\tau$ are thus constructed: the estimated
endowments of the three health attributes
averaged over the survey rounds in
which the individual was present excluding
period $\tau$.13


13Formally, the set of instruments $Z_{i\tau}^j$ associated
with the endowment of type j for individual i in period
$\tau$ is constructed as

$Z_{i\tau}^j = \frac{1}{T_i - 1} \sum_{t \neq \tau} \hat{\mu}_{it}^j$

where j = weight-for-height, skinfold thickness, or mid-arm
circumference, and where $T_i$ is the number of
repeated measures (rounds) available for person i.
Instruments for the mean weight-for-height endowments
of groups (classes) of family members in period
$\tau$ are constructed as the group means of the
individual-specific means $Z_{i\tau}^j$.
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The household welfare-maximization mod- 
el, as noted, implies that the calories allo- 
cated to an individual in the household de- 
pend on that person’s characteristics (age, 
sex, and endowment), the characteristics of 
all other household members, and house- 
hold or village-specific characteristics such 
as health-program availability and food 
prices. With respect to village-level vari- 
ables, because our longitudinal sample is 
taken from only two villages, a village 
dummy variable captures all village-specific 
determinants. To summarize parsimoniously 
the intrahousehold distribution of the ex- 
ogenous characteristics of household mem- 
bers, we computed the household means of 
those variables, namely mean age, mean age 
squared (variance of ages), proportion of 
household members male, and the mean of 
the household’s endowments. Household- 
specific variables include water sources and 
family income. Family income is treated as 
an endogenous variable because wages are 
assumed to depend on endowments, calorie 
allocations, and the level of effort. 

The first column of Table 4 provides 
two-stage generalized least-squares esti- 
mates of the logarithmic calorie-allocation 
equation, estimated with the full sample of 
individuals from the two villages, but in 
which instruments are not used for the en- 
dowment variables. The second column pro- 
vides parameter estimates that use the in- 
struments for the endowment variables. As 
predicted, the uninstrumented coefficient 
estimate for own endowment is algebraically 
less than the (positively signed) instru- 
mented own endowment coefficient esti- 
mate. Indeed, it is of opposite sign, indicat- 
ing compensation when there is evidently
net reinforcement with respect to calories.14
The uninstrumented family or cross-endow- 


14The coefficient on own endowment in these loga-
rithmic calorie-allocation equations should be inter- 
preted as the elasticity of own health with respect to 
own endowment conditional on mean family endow- 
ments remaining fixed. This elasticity then corresponds 
to the experiment in which a transfer of endowment 
occurs within the household that leaves mean endow- 
ments unchanged. This same interpretation also ap- 
plies to the own age and sex coefficients. 
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TABLE 4—~TWo-STAGE GENERALIZED LEAST-SQUARES ESTIMATES: 
EFFECTS OF PERSONAL AND FAMILY CHARACTERISTICS ON THE 
ALLOCATION OF PERSONAL CALORIE CONSUMPTION: 


: Individual weight /height . 
endowment 


Family endowment , 


Family endowment, males 


` Family endowment, females 


a 


` Age square 


Family income 


Age 


« 


cd ` 


Sex (male = 1) - 


Age X sex - 


Mean age of family members 


Variance of ages of family ° 


members 
Proportion of family members 


male 


‘Water drawn from tube well 


Jorbaria village 


Constant 


X? (no family error: 


component)? 


Share of family error 
component variance 


in total error variance 


Two-stage least-squares estimates” 


All family members 


No instruments 
for endowments 


— 0.145 
(1.98) 
~ 0.867 
(1.01) 


Instruments for 
endowments? 


0.1327 


(1.33) 
-1.15 
(0.75) 


By sex 
Males Females 
0.676 - 0.0662 
(4.14) (0.26) 

— 0.743 —0.414 
(1.72 (0.27 
—0.325 —0.0709 
(1.10) (0.20) 
0.0839 0.0961 
(1.98) (2.07) 
1.44 1.33 

(17.5) . 05.7) 
— 0.200, . — 0.196 
(12.9) (11.6) 
- 0.00318 — 0.122 
(0.03) (1.44) 
— 0.0837 —0.120 
(0.99 (61) 
-0.00762 — 0.0777 
(0.04) (0.55) 
0.162 0.271 
(1.58) (3.29) 
0.196 0.283 ` 
(1.95) ‘-G.54) 
4,52 4.82 
(10.3) (13.2) 
407 371 ` 
129.0 38.06 
0.234 0.218 


“All variables in logs, except sex, water source, location, and sex ratio. 
Asymptotic ¢ ratios in parentheses. 
“Instruments for income and endowments are: household landholdings and house- 
hold head’s schooling and age; and means of individual and family endowments for 
weight/height, skinfold thickness, and arm circumference calculated over all survey 


rounds, excluding the round from which observation is drawn. 


Endogenous variable. 


“Lagrange multiplier (Breusch-Pagan test). 
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ment parameter is algebraically greater than 
the consistent estimate, but as the consis- 
tent estimate is negative, the sign of the bias 
could not be predicted unambiguously a pri- 
ori. A comparison of the first two columns 
in Table 4 also reveals that other parame- 
ters are biased substantially as well. 

The consistent estimate of the own en- 
dowment effect in column 2 of Table 4 
suggests that there is reinforcement with 
respect to calories, although the coefficient 
is not statistically different from zero at 
standard levels of significance (t= 1.32). 
However, the activity distributions reported 
in Table 2 indicate that there are important 
gender differences in activities, and thus, as 
our framework suggests, endowment effects 
may differ by gender. In columns 3 and 4 of 
Table 4, we provide two-stage generalized 
least-squares estimates of calorie-allocation 
equations stratified by sex (with all endow- 
ments instrumented). The estimates indi- 
cate that a 10-percent increase in a male’s 
endowment increases his calorie allocation 
by 6.8 percent; the own endowment effect 
for females is one-tenth that of males. These 
differences are consistent with the theory, 
given the lack of participation by women in 
energy-intensive activities and the findings 
in similar settings that health matters for 
the wages of men but not women. 

Corresponding to the positive and signif- 
icant own endowment effect for males, the 
cross effect of the endowment of other males
in the household is negative. These results 
thus reject the pure income-maximizing 
model, since if households allocate calories 
and effort so as to maximize income, all 
cross effects will be zero. The theory also 
predicts that, if there is calorie reinforce- 
ment, the effect of an increase in a female’s 
health endowment on the calories allocated 
to others in the household should be less in 
absolute value than the effect of an increase 
in a male’s health endowment, if health 
status is less important for women in their 


15The reduction in total sample size that occurs
when the sample is stratified by sex results from the 
necessity of using only those households that have both 
males and females in order to estimate gender-specific 
cross and own endowment effects. 
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activities. In both the male and female calo- 
rie-allocation equations of Table 4 (columns 
3 and 4), the effect of the mean endowment 
of females on calorie consumption is indeed 
considerably less in absolute value than the 
effect of the mean endowment of males, 
although the difference is not statistically 
significant because of the imprecision with 
which both endowment effects are mea- 
sured. 

The parameter estimates reported in 
Table 4 may specify cross effects in an im- 
perfect manner by assuming that they can 
be represented by the means (and higher 
moments) of household distributions, al- 
though adding second- and third-order mo- 
ments did not significantly improve the fit of 
these equations. A household fixed-effects 
estimator, however, provides an estimate of 
own endowment effects that requires no 
assumptions about the parameterization of 
the household variables, because the full set 
of cross terms and household-specific re- 
gressors are impounded in the fixed effect. 
Although this approach is more likely to 
avoid specification error, it, of course, pre- 
vents identification of the parameters asso- 
ciated with family endowments and other 
household-specific regressors. 

Table 5 reports household fixed-effects 
two-stage generalized least-squares esti- 
mates, by gender, of the effects of personal 
characteristics on individual calorie con- 
sumption. The set of instruments for the 
own endowment measure remains the same. 
The estimation procedure takes into ac- 
count the sample property that individuals 
appear more than once. The null hypothesis 
of no individual-specific error components 
is indeed rejected by the Breusch-Pagan 
(Lagrange multiplier) test statistic in each 
equation estimated.16


16Note that only household (and not individual)
random effects were specified in estimating the calorie- 
allocation equations of Table 4. The Breusch-Pagan 
(Trevor S. Breusch and Adrian Pagan, 1980) statistics 
of Table 5 confirm the importance of individual effects 
even when controlling for household fixed effects. The 
parameter estimates of Table 4 are nonetheless consis- 
tent, but standard errors are underestimated by about 
10 percent, based on our experience in obtaining the 
estimates reported in Table 5. 
TABLE 5—FIXED-EFFECTS TWO-STAGE GENERALIZED LEAST SQUARES:
EFFECTS OF PERSONAL CHARACTERISTICS ON
INDIVIDUAL CALORIE CONSUMPTION

                                   Two-stage least-squares estimatesᵃ
                                      Males                    Females
                              Endowment   Endowment     Endowment   Endowment
                              effects     effects vary  effects     effects vary
Variableᵇ                     constant    with age      constant    with age

Own endowmentᶜ                 0.447         —          −0.0278        —
                              (3.58)                    (0.15)
  Age < 6ᶜ                       —         −0.435          —         −0.314
                                           (1.35)                    (0.46)
  6 ≤ age < 12ᶜ                  —          0.923          —          1.86
                                           (2.29)                    (2.13)
  Age ≥ 12ᶜ                      —          1.21           —          0.0894
                                           (2.69)                    (0.13)
Age                            1.44         1.31          1.34        1.35
                              (22.9)       (14.9)        (18.1)      (17.9)
Age squared                   −0.201       −0.170        −0.199      −0.206
                              (16.7)       (9.16)        (13.4)      (13.7)
N                              429          429           371         371
χ² (no individual error        46.5         48.35         32.36       26.17
  components)
Individual error variance/     0.287        0.300         0.258       0.282
  total error variance

ᵃAsymptotic t ratios in parentheses.
ᵇAll variables in logs.
ᶜInstrumental variables used are means of individual and family endowments for
weight/height, skinfold thickness, and arm circumference calculated over all survey
rounds, excluding the round from which observation is drawn.


Columns 1 and 3 of Table 5 report the 
within-household, gender-specific (logarith- 
mic) calorie-allocation equations having own 
endowment, own age, and own age squared 
as regressors. The parameter estimates di- 
verge very little from those reported in Table 
4. The elasticity of calorie consumption with 
respect to own health endowment is 0.447 
(t = 3.58) for males, indicating reinforcement, 
and is only −0.028 (t = −0.15) for 
females. 

In columns 2 and 4 of Table 5, we report 
estimates obtained using specifications of 
the calorie equation in which sex-specific 
own endowment effects are allowed to vary 
across the three age groups that appear 
from Table 2 to be related to the differen- 
tiation of activity patterns. The pattern of 
estimated own endowment effects matches 
up well with the pattern of activities presented 
in Table 2. Both male and female 
young children (aged less than six years) 
have the (algebraically) smallest own en- 
dowment effects. There is no labor-market 
return to higher endowments for these fam- 
ily members, and thus calorie compensation 
dominates; part of the better health derived 
from a higher endowment is “taxed” away 
by the household, in this case solely via the 
allocation of foods. 

Male and female children aged 6-12 years 
evidently engage in a more diverse set of 
activities ranked by energy intensity, and 
the own endowment parameters exhibit re- 
inforcement and are statistically significant 
for both males and females. A 10-percent 
increase in the health endowment of a 
6-12-year-old child increases calorie con- 
sumption by 9.2 percent if the child is male 
and 18.6 percent if the child is female. The 
higher rate of reinforcement for girls in this 
age group is consistent with their greater 
diversity of activities, categorized by energy 
intensity, displayed in Table 2. Adult males 
exhibit the greatest diversity of activity 
choice ranked by energy intensity among all 
age-sex groups, while adult females have 
very limited diversity and are concentrated 
in less energy-intensive activities. Reflecting 
this, the estimated own endowment elastic- 
ity of calorie consumption for adult males is 
positive, statistically significant, and the 
largest of all groups (1.21), while that for 
adult females is close to zero (0.09). 

If $\beta_k$ is the estimated own endowment 
effect for age-sex group k, then the variability 
in consumption in group k depends on 
$\beta_k$ and on the group’s dispersion in endowments, 
as $\mathrm{Var}(c^k) = \beta_k^2\,\mathrm{Var}(\mu^k)$. Based on 
our estimates of the endowments, we cannot 
reject the hypothesis that all within-group 
endowment variances are equal. Our 
estimates of the $\beta_k$ values thus imply that 
household allocation rules and the variation 
in the effects of health on productivity across 
activities are in part responsible for the 
higher variability in intrahousehold calorie 
allocations among males relative to females 
for individuals aged 12 and over. These 
factors also contribute to the variability 
among girls and boys for those household 
members aged between 6 and 12 years, with 
no effect among boys and girls less than 6.¹⁷ 
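
As a rough numerical illustration of this variance relation, and assuming (as the test reported above suggests) a common endowment variance across groups, the Table 5 point estimates imply the ratios of consumption variance to endowment variance shown below. This is only a sketch: it uses point estimates, ignores sampling error, and covers only the endowment-driven component of consumption variability.

```latex
\begin{align*}
\frac{\mathrm{Var}(c^k)}{\mathrm{Var}(\mu^k)} &= \beta_k^2, \\
\text{adult males:}\quad      & 1.21^2 \approx 1.46, \\
\text{adult females:}\quad    & 0.0894^2 \approx 0.008, \\
\text{boys aged 6--12:}\quad  & 0.923^2 \approx 0.85, \\
\text{girls aged 6--12:}\quad & 1.86^2 \approx 3.46.
\end{align*}
```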


C. Endowments, Family Income, 
and Activity Participation: 
Household Discrimination 


Although the results thus far are consis- 
tent with there being a return to health in 
the labor market, it remains to demonstrate 
with these data that income is positively 
associated with health endowments and that 
individuals with higher endowments are 
more likely to choose activities with a greater 


"To test for seasonality in endowment effects, we 
tested whether endowment responses varied with income 
by interacting the endowment and age variables 
with household income using the household fixed- 
effects procedure. We could not reject the hypothesis 
that endowment and age effects were independent of 
income levels. 
energy intensity of effort, as implied by the 
theory. Moreover, with estimates of endow- 
ment effects on energy intensity, we can use 
(11) to compute how the household on net 
responds to differences in endowments for 
each gender. While the Bangladesh data do 
not provide information on individual- 
specific wage rates or earnings, we can test 
whether households with higher average 
health endowments among adult males have 
higher incomes, given land resources and 
schooling. Table 6 (column 1) provides esti- 
mates of the determinants of (log) per capita 
income. These show that income is posi- 
tively and significantly associated with the 
average endowments of males older than 12 
years of age, but as expected, the adult 
female endowment elasticity of income is 
only one-sixth as large as the male endow- 
ment elasticity and not statistically different 
from zero. 

Table 6 (column 2) also reports maxi- 
mum-likelihood (ML) instrumental-variable 
estimates of a probit activity-choice equation 
for individuals aged 12-60.¹⁸ The dichotomous 
dependent variable in this equation 
has the value of 1 if an adult is engaged 
in an exceptionally active occupation (the 
only activity category that substantially re- 
duced weight-for-height in the estimated 
production function; Table 3) and 0 other- 
wise. Here, own endowment has a positive 
and statistically significant (at the 10-per- 
cent level) effect on the probability of par- 
ticipating in an exceptionally active activity. 
In addition, consistent with the calorie-allo- 
cation estimates of Table 4, the male family 
endowment has a large negative influence 
on this probability, five times larger than 
the influence of the female family endow- 
ment. The coefficient on sex (male = 1) is 
positive and statistically significant, reflect- 
ing the differences between sexes in the 
diversity of occupations, given endowments. 
Thus, the results reported in Table 6 con- 
firm that there is a pecuniary return to 
health and effort, that adult males with 


¹⁸The likelihood maximized is given in Richard J. 
Smith and Richard W. Blundell (1986). 
TABLE 6—DETERMINANTS OF THE LOG OF PER CAPITA HOUSEHOLD INCOME AND
PROBABILITY OF PARTICIPATING IN AN “EXCEPTIONALLY ACTIVE”
OCCUPATION AMONG PERSONS AGED 12-60 YEARS

                                                                  Exceptionally active
                                                                  occupation
                                      Per capita income           (full-information
Variable                              (two-stage least-squares)ᵃ  ML IV probit)ᵃ

Own endowmentᵇ                               —                      13.9
                                                                   (1.64)
Family endowment,                            2.38                  −16.74
  males > 12 years oldᵇ                     (2.86)                 (2.29)
Family endowment,                            0.378                  −3.67
  females > 12 years oldᵇ                   (0.75)                 (1.25)
Age                                          —                       1.28
                                                                   (1.06)
Sex                                          —                       6.92
                                                                   (2.36)
Landholding                                  0.0200                 −0.0219
                                            (0.64)                 (2.46)
Household head’s schooling                   0.109                  −1.09
                                            (1.80)                 (1.54)
Mean age of family members                   0.0444                 −1.58
                                            (0.14)                 (1.18)
Variance of ages of family members           0.591                  −5.55
                                            (1.91)                 (2.50)
Proportion of family members male            0.566                  −4.11
                                            (1.09)                 (1.82)
Jorbaria village                            −0.199                  −5.98
                                            (1.30)                 (2.19)
Constant                                     4.23                    4.95
                                            (4.11)                 (1.17)
N                                            45                     153
F                                            3.73                    —
χ²                                           —                      76.2

ᵃAsymptotic t ratios in parentheses.
ᵇInstrumented.


higher endowments are more likely to un- 
dertake exceptionally energy-intensive work, 
and that adult female health endowments 
are relatively unimportant in determining 
activity choices or household income com- 
pared to adult male endowments. 

Finally, the net effect of a change in own 
endowment on own health [eq. (11)] can be 
calculated from the estimates of the health 
technology in Table 3 and the estimated 
endowment effects on calories and activities 
in Tables 5 and 6. For both adult males and 
females (aged 12 years and above), our esti- 
mates indicate that, in addition to its direct 
effect on health, an increase in endowment 
tends to increase health by increasing calo- 
rie consumption and to reduce health by 


inducing greater intensity of effort. The lat- 
ter indirect effect dominates the former for 
both sexes. The elasticity of own health 
with respect to own endowment is 0.88 for 
adult males and 0.97 for adult females.¹⁹ 


¹⁹The elasticity is given by 
$$\frac{d\ln h}{d\ln\mu} = 1 + \frac{\partial\ln h}{\partial\ln c}\,\frac{d\ln c}{d\ln\mu} + \frac{\partial\ln h}{\partial e}\,\frac{de}{d\ln\mu}.$$ 


The health elasticity of calories, the first parenthetical 
term in the elasticity expression, is 0.136 for both males 
and females (Table 1). The elasticity of calorie consumption 
with respect to own health endowment 
($d\ln c/d\ln\mu$) is 1.21 for adult males and 0.089 for 
Bangladesh households thus exhibit compensatory 
behavior with respect to health. 
Moreover, as the difference between the 
endowment elasticity and unity can be 
thought of as a “tax” levied by the house- 
hold on the exogenous health of its mem- 
bers, our estimates indicate that the exoge- 
nous health of adult males is taxed at a 
higher rate than the exogenous health of 
adult females (12 percent vs. 3 percent). 


IV. Conclusion 


In this paper, we have examined the de- 
terminants of calorie consumption and ac- 
tivity choices from the perspective of a 
model of intrahousehold allocation that in- 
corporates individual heterogeneity in ex- 
ogenous healthiness and differences in la- 
bor-market returns to health and effort 
across groups of individuals. The empirical 
analysis was applied to individual and 
household-level data from Bangladesh, a 
country that exhibits large differences in 
calorie consumption and in the energy in- 
tensity of activity by age and gender. Our 
results reveal that energy-intensive effort 
tends to reduce health as measured by 
weight-for-height, that there is a pecuniary 
return to health and effort, and that there is 
substantial calorie reinforcement for those 
classes of individuals best able to alter the 
energy intensity of effort. In particular, adult 
males (aged 12 years and above) and male 
and female children (aged 6-12) were found 
to receive calorie reinforcement with re- 
spect to their health endowments. These 
classes of individuals were also those ex- 
hibiting the most diverse activity choices 
ranked by energy intensity. Thus, linkages 


adult females (Table 5). The effect of participation in 
an exceptionally active occupation on (log) health 
($\partial\ln h/\partial e$) is −0.082 for both males and females (Table 3). 
The estimated effect of (log) own health endowment 
on the probability of participating in an exceptionally 
active occupation is 13.9 (Table 6) multiplied 
by the standard normal density function evaluated at 
the value of the underlying latent activity variables for 
adult males and females, which are 0.247 and 0.038, 
respectively. 
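
The arithmetic behind the reported elasticities can be retraced with a short script. This is a sketch that simply plugs the values quoted in this footnote into the elasticity expression; the constant and function names are illustrative and do not come from the paper.

```python
# Parameter values quoted in the footnote; the names are illustrative only.
HEALTH_ELASTICITY_OF_CALORIES = 0.136    # d ln h / d ln c, both sexes
ACTIVITY_EFFECT_ON_LOG_HEALTH = -0.082   # d ln h / d e, both sexes (Table 3)
PROBIT_ENDOWMENT_COEFFICIENT = 13.9      # own endowment coefficient (Table 6)

# group -> (calorie elasticity w.r.t. endowment, normal density at the latent index)
GROUPS = {
    "adult males": (1.21, 0.247),
    "adult females": (0.089, 0.038),
}


def net_health_elasticity(calorie_elasticity, density):
    """Direct effect (1) plus the calorie channel and the activity channel."""
    calorie_channel = HEALTH_ELASTICITY_OF_CALORIES * calorie_elasticity
    # The probit coefficient times the density gives d e / d ln(endowment).
    activity_channel = (
        ACTIVITY_EFFECT_ON_LOG_HEALTH * PROBIT_ENDOWMENT_COEFFICIENT * density
    )
    return 1.0 + calorie_channel + activity_channel


results = {group: net_health_elasticity(*values) for group, values in GROUPS.items()}

for group, elasticity in results.items():
    implied_tax = 1.0 - elasticity  # share of the endowment "taxed" away
    print(f"{group}: elasticity {elasticity:.2f}, implied tax {implied_tax:.0%}")
```

Rounded to two decimals, the script reproduces the 0.88 and 0.97 elasticities and the 12 percent and 3 percent "tax" rates cited in the text.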
between health levels and productivity, 
combined with the circumscribed activities 
of adult women in Bangladesh, appear to 
account for part of the disparities in the 
average consumption of nutrients across 
adult men and women and to contribute to 
the greater variability among men in nutri- 
ent consumption. 

Our results also reject the income-maxi- 
mizing model of the household in favor of a 
model in which households exhibit some 
aversion to inequality. Indeed, even though 
the rate of calorie reinforcement for adult 
males was quite high (1.21 in elasticity) and 
almost zero for adult females, the greater 
likelihood of adult males with higher en- 
dowments to undertake exceptionally en- 
ergy-intensive work resulted in a “tax” on 
adult male endowments that exceeded that 
of adult females (12 percent vs. 3 percent), 
signaling some discrimination against males 
by the household. 

Our evidence that disparities by gender in 
food consumption in a low-income society 
like Bangladesh reflect the gender differen- 
tiation in the energy intensities of activities 
suggests that increases in labor-force oppor- 
tunities for women, ceteris paribus, will likely 
increase the calories allocated to women. 
However, as we have shown, the health and 
welfare benefits of such an increase in calo- 
rie consumption by women will be tempered 
by the increased level of energy-intensive 
activity associated with greater calorie con- 
sumption. Furthermore, while an increase 
in the occupational diversity of women is 
likely to reduce (calorie) consumption in- 
equality between the sexes, it will increase 
inequality among adult women and thus may 
increase overall inequality in consumption 
and health. The increase in inequality 
among women will reflect the increased im- 
portance of the distribution of endowments 
in determining the distribution of calories 
when there is a greater return to effort and 
health in the labor market. However, to the 
extent that economic development is char- 
acterized by a transformation of work activi- 
ties to those in which linkages between food 
consumption and productivity are weak, 
overall inequality in food consumption may 
be attenuated for all groups as incomes rise. 
A Social Exchange Approach to Voluntary Cooperation 


By HEINZ HOLLÄNDER* 


A social exchange approach to voluntary cooperation is developed on the 
assumption that voluntary cooperative behavior is motivated by social approval, 
which is conceptualized as an emotional activity. The associated unique Nash 
equilibrium may have attractive welfare properties and provides an understanding 
of spontaneous norm emergence. Furthermore, the opening of a market or 
government intervention for the collective good is shown to affect voluntary 


cooperation negatively. (JEL 024, 025) 


This article deals with the problem of 
voluntary cooperation for a pure collective 
good when the impact of feasible individual 
contributions on its supply can be neglected. 
According to standard theory, rational 
agents will entirely fail to cooperate 
because they are caught in a Prisoner’s 
Dilemma. However, as many have observed, 
the empirical extent of cooperation is in 
marked contrast to the one expected from 
standard theory.¹ Thus, there is need for a 
more sophisticated theory that attributes 
success or failure of cooperation to the cir- 
cumstances at hand. As is well known, sev- 
eral explanations of voluntary cooperation 
have been put forward,² but they seem to 


*Department of Economics and Social Sciences, 
University of Dortmund, 4600 Dortmund 50, Federal 
Republic of Germany. I thank Matthias Fischer, Jurgen 
Frank, Franz Haslinger, Wolfram Richter, and Joachim 
Weimann for discussion and/or helpful suggestions. I 
am also grateful to three anonymous referees for their 
comments. 

¹For instance, people join political parties, trade 
unions, the Red Cross, and other nonprofit organiza- 
tions; they turn out for elections, demonstrations, and 
strikes; many give considerable amounts to charity and 
make voluntary blood donations; and, finally, they of- 
ten help others, show consideration for others, line up, 
are honest, don’t cheat whenever they can get away 
with it, and comply with other behavioral norms. 

²The more important ones are the by-product approach 
of Mancur Olson (1965); the altruism approach 
of, for instance, Robert Schwartz (1970), Gary S. Becker 
(1974), David A. Collard (1978 Ch. 10), Kenneth J. 
Arrow (1981), and Howard Margolis (1982); the iter- 
ated game approach of, for instance, Peter Hammond 
(1975), Mordecai Kurz (1977), and Andrew Schotter 
(1981); and the sociobiological approach of, for instance, 
Edward O. Wilson (1975) and Robert Axelrod 
(1984). 
be exposed to severe criticism,³ or, as 
Olson’s (1965) by-product theory, can at best 
account for a small part of observed cooper- 
ative behavior. This paper offers a simple 
axiomatic model of social exchange in which 
cooperative behavior is motivated by the 
expectation of emotionally prompted social 
approval and explores some of its implica- 
tions. The basic hypothesis has a long tradi- 
tion. It was employed, for instance, by Bern- 
hard de Mandeville (1714) in his Fable of 
the Bees, by Adam Smith (1759) in his The- 
ory of Moral Sentiments, and in modern soci- 
ology, by George Caspar Homans (1961) in 
his Social Behavior. In order to demonstrate 
the logic of the social exchange approach as 
clearly as possible, I have chosen simple and 
restrictive but, I hope, nevertheless plausi- 
ble assumptions supporting the main re- 
sults. The contribution is intended to be an 
exemplary argument rather than a general 
theory. 

In the model, individual cooperative con- 
tributions and, thus, the supply of the col- 
lective good depend on the strength of the 
approval incentive, while the latter depends 
on individual contributions and the supply 
of the collective good. I show the existence 
of a unique symmetric Nash equilibrium 
and give a full characterization of the equi- 
librium solution. Furthermore, it is shown 
that the model provides a micro-founded 


³For instance, the altruism approach has been criticized 
by Robert Sugden (1982, 1984), the iterated-game 
approach has been criticized by Dennis C. Mueller 
(1986), and the sociobiological approach has been criti- 
cized by Becker (1976) and Philip Kitcher (1985). 




invisible-hand explanation of the emergence 
of a behavioral norm. In general, the social 
exchange allocation is not Pareto-efficient. 
However, compared to the optimal planning 
allocation and the hypothetically ideal mar- 
ket allocation without approval incentives, it 
may provide more of the collective good 
and higher group welfare. A further result 
confirms the argument of Fred Hirsch (1976) 
that the opening of a market for the collec- 
tive good supersedes voluntary contribu- 
tions, at least partly, and possibly decreases 
social welfare. 

In Section I, the model setting is developed. 
Section II contains a formal description 
of rational and emotional behavior, and 
it explores the consequences of consistent 
social interaction. The relationship between 
the sociological concept of a behavioral 
norm and the social exchange equilibrium is 
discussed in Section III. The welfare analy- 
sis is done in Section IV, and Section V 
offers a few concluding remarks. 


I. The Model Setting 


The model proceeds from a group of n 
identical agents. Initially, every agent is provided 
with $\pi$ units of a private good. If b 
units are contributed to the collective good, 
the remaining units, p, are privately consumed: 


(1) $p = \pi - b, \qquad 0 \le b \le \pi.$ 


The collective good is produced by means 
of aggregate forgone private consumption 
alone. For simplicity, I assume constant re- 
turns to scale with costs proportional to 
group size. Thus, the amount produced is 


(2) $c = \frac{1}{n}\sum_{j=1}^{n} b_j.$ 


Furthermore, it is assumed that the group 
is large in the sense that the effect of every 
feasible individual contribution on collective 
good supply can be neglected. 

A cooperative contribution satisfies the 
following three conditions that characterize 
a pure gift: first, it confers some benefit 
upon others; second, it imposes net costs 
upon the cooperative agent; and finally, it is 
voluntary. This suggests that other group 
members will behave toward cooperative 
agents in essentially the same way as donees 
do toward donors. In general at least, recip- 
ients of gifts react with, for instance, grati- 
tude or sympathy for their benefactors. 
These reactions typically are not rationally 
calculated but, rather, prompted by the 
stimulus-response mechanism of the human 
emotional system. In analogy to the gift case 
and in accordance with Homans (1961), I 
proceed on the assumption that some sub- 
jective value of a cooperative contribution b 
as assessed by the reacting agent measures 
the stimulus power s(b) prompting emo- 
tional reactions. It should be noted that 
the introduction of emotional activities 
prompted by some stimulus takes us beyond 
rationally calculating economic man. 

Two categories of emotional reactions can 
be distinguished. The first comprises emo- 
tions, feelings, and attitudes as inner states. 
To the second belong all activities associ- 
ated with these inner states such as, for 
instance, facial expressions, verbal expres- 
sions of gratitude, and even killing in the 
heat of passion. These activities can be re- 
garded as expressions of inner states. In 
accordance with Homans (1961) and Smith 
(1759), an emotional reaction consisting of 
an inner state and its particular expression 
is called a sentiment. 

There seem to be quite a number of 
different classes of sentiments respectively 
characterized by the extremes sympathy and 
antipathy, love and hate, gratitude and re- 
sentment, joy and grief, pride and shame, 
admiration and contempt, and presumably 
some others. Within each class, the ex- 
tremes are generally understood as ex- 
tremely “positive” or extremely “negative” 
sentiments, and at least in principle, people 
are able to order the sentiments of a partic- 
ular class according to their “positivity.” 
Webster’s Dictionary describes the meaning 
of positive in this context as “marked by 
acceptance or approval” and “indicating 
agreement or affirmation.” Thus, one can 
replace “positive” and “negative” by “approving” 
and “disapproving,” respectively. 
All agents are assumed to order the senti- 
ments of any particular class by the same 
reflexive, transitive, and complete relation 
“is at least as approving as.” 

A complex of sentiments consisting of 
exactly one sentiment from each class is 
called a sentiment bundle. An agent’s emo- 
tional reaction pattern is described by the 
response function f that, for all real stimu- 
lus values s, assigns a sentiment bundle f(s) 
to s.⁴ The response function is assumed to 
be continuous and monotonically increasing 
in the following sense: For all $s^1 > s^0$, all 
components of $f(s^1)$ are at least as approving 
as the corresponding components of 
$f(s^0)$, and at least one is more approving. 
The more valuable the behavior of others is 
to the reacting agent, the more approving is 
the sentiment bundle prompted. Then, $f^{-1}$ 
is also monotonically increasing, since it 
maps more approving sentiment bundles to 
higher stimulus values. This means that $f^{-1}$ 
is an ordinal approval scale. Hence, the 
stimulus value s prompting the sentiment 
bundle f(s) measures the approval associ- 
ated with f(s). 

An agent can obtain approval only from 
those who come to know his behavior and 
communicate their feelings to him. It is 
assumed that these requirements are met 
exactly by those with whom the agent regu- 
larly keeps company: his kin, friends, ac- 
quaintances, neighbors, etc. This subgroup 
is called the reference group of the respec- 
tive agent and is assumed to be of equal size 
for all agents. As the model is concerned 
with symmetric behavior only, the amount 
of approval obtained from a typical member 
of the reference group can be used as an 
index of total approval obtained. Thus, s(b) 
is the total amount of approval received in 
return for a cooperative contribution b. 
Furthermore, agents are assumed to know 


⁴To treat the response function as exogenous does 
not mean that it is determined exclusively biologically. 
Certainly, there is some cultural and educational influence. 
Nevertheless, emotions are essentially not subject 
to rational control but, rather, are autonomously 
triggered by the hypothalamus. 
the sentiments excited by their behavior⁵ so 
that, for theoretical purposes, they can be 
treated as if they knew s(b). 

All agents are assumed to have identical 
preferences for the private good, the collec- 
tive good, and social approval. With respect 
to approval preferences, several aspects have 
to be distinguished. First, it seems to be 
part of our genetical hardwiring that we are 
interested in being the objects of others’ 
positive emotions. This means a preference 
for social approval in the sense of inner 
states. Second, we also have a preference 
for the way emotions are expressed. We are 
not indifferent between, for instance, an 
invitation for dinner and a bodily attack. 
One can, however, reasonably assume that 
the expressive activity is the more favorable 
the more approving the respective senti- 
ment is. This means that the preference for 
sentiments as combinations of inner emo- 
tional states and expressive activities can be 
represented as a preference for approval. 
Third, often people not only want to be 
loved and admired but also to be loved and 
admired more than others.⁶ I take this into 
account by assuming a preference for comparative 
approval $s(b) - s(c)$, where $s(c)$ indicates 
the approval associated with average 
behavior. For convenience, it is assumed 
that absolute and comparative approval can 
be aggregated into the weighted average 
$(1-\alpha)s(b) + \alpha[s(b) - s(c)]$. Thus, the value 
of the relevant approval variable is 


(3) $a = s(b) - \alpha s(c), \qquad 0 \le \alpha \le 1.$ 


Preferences are represented by a utility 
function 


(4) $u = u_p(p) + u_c(c) + u_a(a)$ 


that satisfies the conventional assumptions 


⁵This may be justified because of sufficient experience 
or, as Adam Smith (1759 pp. 9-13) argued, because 
of “sympathy,” the ability to enter into the 
feelings of others if only the stimulating situation is 
known. 

⁶The assumption is employed, for instance, by 
Mandeville, Smith, and Homans. Robert H. Frank 
(1985) gives a detailed argument for status orientation. 
of monotonicity and concavity and, for simplicity’s 
sake, implies essentiality of the private 
good ($u_p'(0) = \infty$), as well as an absolute 
elasticity of $u_a'$ smaller than one. 


II. Individual Behavior and Social Interaction 
A. Individual Behavior 


The stimulus power s(b) that measures 
approval was conceptualized as some sub- 
jective value of the cooperative contribution 
as assigned by the approving agent. The 
subjective value of a unit contribution, w, is 
called the “approval rate.” In the gift case, 
the gratitude toward a donor apparently 
does not only depend on the absolute sub- 
jective value of the gift but also upon how 
this value compares with the values of simi- 
lar gifts. Analogously, I assume that the 
effective subjective value that prompts approval 
is a weighted average of the absolute 
value $wb$ and the comparative value 
$w(b - c)$. The corresponding weights are 
$1 - \beta$ and $\beta$ so that 


(5) $s(b) = w(b - \beta c), \qquad 0 \le \beta \le 1.$ 


Substituting for s(b) and s(c) from (5) 
into (3), one obtains 


(6) $a = w(b - \sigma c),$ 
$0 \le \sigma = \alpha + \beta - \alpha\beta \le 1.$ 


The coefficient $\sigma$ indicates the strength 
of the negative externality emanating from 
the average contribution. For simplicity of 
language, I treat $\sigma$ as a measure of “status 
partly correct. 
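
The substitution that leads from (3) and (5) to (6) can be spelled out step by step. The following is a sketch of the algebra using only the symbols defined above, with $s(c)$ obtained by evaluating (5) at $b = c$.

```latex
\begin{align*}
s(c) &= w(c - \beta c) = w(1 - \beta)c, \\
a &= s(b) - \alpha s(c) \\
  &= w(b - \beta c) - \alpha w(1 - \beta)c \\
  &= w\bigl[b - (\alpha + \beta - \alpha\beta)c\bigr] \\
  &= w(b - \sigma c), \qquad \sigma = \alpha + \beta - \alpha\beta .
\end{align*}
```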

A typical agent is confronted with the 
respective behavior of others, characterized 
by w and c, which is beyond his influence. 
He responds to this behavior by some ratio- 
nally chosen contribution, b, and some 
emotionally determined approval rate, v, 
which he applies to other agents’ contributions. 
The agent’s optimal contribution maximizes 


(7) $u = u_p(\pi - b) + u_c(c) + u_a[w(b - \sigma c)], \qquad 0 \le b \le \pi$ 


for given w and c with respect to b. The 
following necessary (and sufficient) condition 
characterizes an optimum:⁷ 


(8) $u_p'(\pi - b) \ge w\,u_a'[w(b - \sigma c)]$, with equality if $b > 0$. 


Exploiting the assumed properties of the 
utility function, I routinely obtain the fol- 
lowing results from (8): 


PROPOSITION 1: Optimization defines an 
individual contribution function $b(w, \sigma c, \pi)$ 
with (i) $b > 0$ if and only if 
$u_p'(\pi) < w\,u_a'(-w\sigma c)$ and (ii) $b_w, b_{\sigma c}, b_\pi > 0$ and 
$b_{\sigma c}, b_\pi < 1$ for all $b > 0$. 


Condition (i) means that the material in- 
centive for cooperation, w, must be suffi- 
ciently large in order to induce cooperative 
behavior. Formally, the restrictions on $b_{\sigma c}$ 
and $b_\pi$ are due only to the normalcy of both 
the private good and approval, whereas $b_w > 0$ 
results from a dominant substitution 
effect. It is remarkable that individual con- 
tributions are related positively to the de- 
gree of status orientation as well as to other 
agents’ contributions, provided $\sigma$ and c are 
positive. Increased status orientation makes 
“keeping up with the Joneses” more impor- 
tant, and increased contributions of others 
induce efforts to regain at least some of the 
status lost. Positively related individual con- 
tributions are also experimentally observed 
by James H. Bryan and Mary Ann Test 
(1967). 
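
A small numerical sketch may help make the comparative statics of the individual contribution function concrete. The functional forms below are assumptions chosen only for illustration and are not taken from the paper: log utility for the private good, so that marginal utility is unbounded at zero, and an approval utility with marginal utility 1 − 0.5·tanh(a), which is positive, decreasing, and has an elasticity below one in absolute value. The function names and parameter values are likewise hypothetical. The code solves the first-order condition for the contribution by bisection.

```python
import math


def marginal_utility_private(p):
    # Assumed form: u_p(p) = ln p, so u_p'(p) = 1/p is unbounded as p -> 0.
    return 1.0 / p


def marginal_utility_approval(a):
    # Assumed form: u_a'(a) = 1 - 0.5*tanh(a), positive and decreasing in a.
    return 1.0 - 0.5 * math.tanh(a)


def optimal_contribution(w, sigma_c, pi, tol=1e-12):
    """Contribution b in [0, pi) satisfying the first-order condition.

    w is the approval rate, sigma_c the product of status orientation and
    the average contribution, and pi the endowment.
    """
    def gap(b):
        # Marginal cost minus marginal approval benefit; increasing in b.
        return marginal_utility_private(pi - b) - w * marginal_utility_approval(
            w * (b - sigma_c)
        )

    if gap(0.0) >= 0.0:
        return 0.0  # corner solution: the incentive is too weak
    lo, hi = 0.0, pi  # gap(lo) < 0 and gap grows without bound as b -> pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


base = optimal_contribution(w=0.2, sigma_c=1.0, pi=10.0)
print("baseline contribution:", round(base, 3))
print("higher approval rate:", round(optimal_contribution(0.3, 1.0, 10.0), 3))
print("higher sigma*c:", round(optimal_contribution(0.2, 2.0, 10.0), 3))
print("weak incentive:", optimal_contribution(0.05, 1.0, 10.0))
```

Under these assumed forms, the contribution rises with the approval rate, with σc, and with the endowment, and it drops to zero when the approval rate is small, in line with the qualitative statements of Proposition 1.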

In order to develop a hypothesis about 
the individual approval rate v, I start with 
the following problem: if individual contri- 
butions have only negligible effects on the 
supply of the collective good and, therefore, 


⁷Because of $u_p'(0) = \infty$, the case $u_p' < w\,u_a'$ at $b = \pi$ 
is impossible. 
on other agents’ well-being, why then should 
anybody regard these activities as valuable? 
Actually, we generally approve of coopera- 
tive behavior even if it does not make us 
significantly better off. In doing so, we often 
seem to consider the hypothetical advantage 
we would enjoy if everybody else behaved 
cooperatively in like manner. This motivates 
the assumption that an agent’s approval rate, 
his subjective value of another agent’s 
marginal contribution stimulating approval, 
is equal to the hypothetical advantage, mea- 
sured in terms of the private good, that the 
former agent would enjoy if not only the 
latter but also all other agents except him 
increased their contributions marginally. 
Formally, this means that v is taken to be 
the marginal rate of substitution between $\pi$ 
and c with respect to (7): 


(9) $v = \dfrac{u_c'(c) - \sigma w\,u_a'[w(b - \sigma c)]}{u_p'(\pi - b)}.$ 


Thus, the individual approval rate is the 
aggregate value of two externalities, the 
positive collective good externality and the 
negative status externality. This seems to be 
supported by the casual observation that 
people sometimes do not unambiguously 
approve of contributions to the collective 
good because, in their opinion, the contrib- 
utors try to distinguish themselves. For a 
full understanding of the role of status ori- 
entation, it should be noted that, on the one 
hand, a high $\sigma$ furthers cooperation for a 
given incentive w (compare Proposition 1), 
but on the other hand, it adversely affects 
cooperation by weakening incentives. 


B. Social Equilibrium 


A BC equilibrium is an average contribu- 
tion consistent with individual contributions 
in the sense that c = b(w, σc, π). Substitut-
ing b = c into (8) and again exploiting the
properties of the utility function, one ob- 
tains: 


PROPOSITION 2: In BC equilibrium, indi-
vidual contributions and supply of the collec-
tive good are a function c(w, σ, π) with (i)
c > 0 if and only if $w > u_1'(\pi)/u_2'(0)$ and (ii)
$c_w, c_\sigma, c_\pi > 0$ for all c > 0.


By and large, c(w, σ, π) behaves like
b(w, σc, π). It can be shown that social in-
teraction reinforces the effects of paramet-
ric variations. 

A VW equilibrium is an approval rate 
consistent with individual behavior in the 
sense that v =w. A simultaneous BC and 
VW equilibrium is called a social exchange 
equilibrium. Substituting from (8) into (9) 
and observing b =c and v = w, one obtains 
the following VW condition for a social ex- 
change equilibrium: 


(10)  $w = \begin{cases} \dfrac{u_3'(0)}{u_1'(\pi) + \sigma u_2'(0)} & \text{if } c = 0 \\[2ex] \dfrac{u_3'(c)}{u_1'(\pi - c)} - \sigma & \text{if } c > 0. \end{cases}$
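The substitution behind the VW condition can be spelled out as a short sketch. It assumes that the interior form of (8), which is not reproduced in this section, reads $u_1'(\pi - b) = w\,u_2'[w(b - \sigma c)]$ (this matches (A4) in the Appendix), and that $u_1$, $u_2$, and $u_3$ denote the utility components of the private good, approval, and the collective good. Both are readings of the notation, not statements taken verbatim from the text.

```latex
\begin{align*}
% Approval rate (9), evaluated at b = c and v = w:
w &= \frac{u_3'(c) - \sigma w\,u_2'[(1-\sigma)wc]}{u_1'(\pi - c)} \\
% Case c > 0: (8) holds with equality, so w u_2'[(1-\sigma)wc] = u_1'(\pi - c):
w &= \frac{u_3'(c) - \sigma\,u_1'(\pi - c)}{u_1'(\pi - c)}
   = \frac{u_3'(c)}{u_1'(\pi - c)} - \sigma \\
% Case c = 0: (8) may be slack, so u_2'(0) stays in the expression:
w\,u_1'(\pi) &= u_3'(0) - \sigma w\,u_2'(0)
   \quad\Longrightarrow\quad
w = \frac{u_3'(0)}{u_1'(\pi) + \sigma u_2'(0)}
\end{align*}
```

At the boundary where (8) just binds at c = 0, the two branches coincide, so the VW curve is continuous there.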


Figure 1 demonstrates the existence of a 
unique social equilibrium (w*,c*). The BC 
curve is positively sloped (compare Proposi- 
tion 2), whereas concavity of the utility 
function makes the VW curve negatively 
sloped. If, for all positive levels of coopera- 
tion, the approval rate sustainable at some 
level of symmetric cooperation is below the 
one required for supporting the respective 
level of cooperation, that is, if the VW curve 
is below the BC curve for all c>0, one 
obtains c* = 0. It is easily shown that in this 
case, the approval rate resulting from (10) 
at c = 0 supports a BC equilibrium with
c = 0.
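The intersection of the BC and VW curves can be illustrated numerically. The following is a minimal sketch, not the author's procedure: the functional forms (logarithmic-type utilities with weights `alpha` and `beta`), the parameter values, and the function name are all hypothetical choices made only so that the conditions can be evaluated. The BC condition is taken to be $u_1'(\pi - c) = w\,u_2'[(1-\sigma)wc]$, as in (A4), and the VW condition is (10).

```python
def social_exchange_equilibrium(pi=10.0, sigma=0.2, alpha=1.0, beta=1.0, tol=1e-12):
    """Return (w_star, c_star) for assumed, purely illustrative utilities.

    Hypothetical marginal utilities (not from the paper):
      private good    u1'(x) = 1 / x            (so u1'(0) is infinite)
      approval        u2'(a) = alpha / (1 + a)
      collective good u3'(c) = beta / (1 + c)
    Requires 0 <= sigma < 1.
    """
    du1 = lambda x: 1.0 / x
    du2 = lambda a: alpha / (1.0 + a)
    du3 = lambda c: beta / (1.0 + c)

    # Corner case: the condition for c* > 0 fails, so c* = 0 and w* comes
    # from the c = 0 branch of the VW condition.
    if du3(0.0) / du1(pi) - du1(pi) / du2(0.0) <= sigma:
        return du3(0.0) / (du1(pi) + sigma * du2(0.0)), 0.0

    def vw(c):
        # Approval rate sustainable at symmetric cooperation level c > 0.
        return du3(c) / du1(pi - c) - sigma

    def bc_gap(c):
        # Marginal approval gain minus marginal private cost of contributing,
        # evaluated at the VW approval rate; zero at the equilibrium.
        w = vw(c)
        if w <= 0.0:
            return -1.0
        return w * du2((1.0 - sigma) * w * c) - du1(pi - c)

    # Bisection: the gap is positive near c = 0 and negative near c = pi.
    lo, hi = 0.0, pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bc_gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    c_star = 0.5 * (lo + hi)
    return vw(c_star), c_star


w_star, c_star = social_exchange_equilibrium()
print(w_star, c_star)
```

Raising `sigma` in this sketch lowers the VW curve one for one, which is the incentive-weakening channel discussed in the text; whether the equilibrium contribution falls depends on the assumed functional forms.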

An increase in π shifts the BC curve to
the right and the VW curve upward. Not
surprisingly, higher endowments are associ-
ated with a higher equilibrium level of co-
operation. In general at least, the role of
status orientation is unclear. An increasing
σ also shifts the BC curve to the right
except at c = 0, but the VW curve is shifted
downward by Δσ. The weakening of the
incentive mechanism obviously dominates if 
c* is sufficiently small so that low equilib- 
rium levels of cooperation tend to be re- 
duced by increased status orientation. The 
following proposition collects the results.


FIGURE 1. SOCIAL EXCHANGE EQUILIBRIUM


PROPOSITION 3: For given π and σ, there
exists a unique social equilibrium (w*, c*) with
(i) w* > 0, (ii) c* > 0 if and only if


$\dfrac{u_3'(0)}{u_1'(\pi)} - \dfrac{u_1'(\pi)}{u_2'(0)} > \sigma$


and (iii) $c^*_\pi > 0$ and, provided c* is suffi-
ciently small, $c^*_\sigma < 0$.⁸


III. An Invisible-Hand Explanation of
Norm Emergence 


In sociological theory, three characteris- 
tics of valid behavioral norms seem to be 
uncontroversial. First, there is a standard of
behavior shared by members of a group. 
The standard is positive in the sense that 
actual behavior conforms to the standard, at 
least on average, and normative in the sense 
that it expresses a shared value judgment as 


to how group members ought to behave.


⁸In addition, it is easily seen that increasing prefer-
ences for approval and the collective good shift the BC
curve to the right and the VW curve upward, respec-
tively, whereas increasing preferences for the private
good shift the BC curve to the left and the VW curve
downward. Thus, if the private good is interpreted as 
available time and if Staffan B. Linder’s (1970) argu- 
ment of an increasing marginal utility of time due to 
increasing consumption possibilities is correct, people 
can be expected to behave more selfishly. 


Second, negative or positive individual devi- 
ations from the standard are punished or 
rewarded by negative or positive sanctions, 
respectively. Third, group members deter- 
mine their behavior on the basis of the 
existing standard and anticipated sanctions.
However, the problem of sociological theo- 
ries of norms is the lack of explicit micro- 
foundations. There is no explicit theory of 
sanction activities or of the relation be- 
tween sanctions and behavior toward the 
collective good, not to mention a theory of 
social interdependence of individual activi-
ties. As norms are unintended social results
of individual actions, the lack of explicit 
microfoundations means that sociological 
theory has little to say as to why and how 
particular behavioral norms emerge or fall 
into decay. 

The model presented shows how social 
interaction brings about a standard of be- 
havior c* to which everybody conforms and 
how this standard relates to preferences and 
exogenous variables such as σ and π. Asso-
ciated with this standard is a “normal”
amount of approval a* = (1 − σ)w*c*. Sanc-
tions are appropriately defined as approval
deviations from normal approval that are
due to deviations of actual from standard
behavior. Thus, deviant behavior b − c* in-
duces a sanction of amount w*(b − c*). The
existence of general sanctions also implies
that the behavioral standard reflects the
normative expectations of group members 
with respect to behavior of others. If some- 
body is sanctioned negatively or positively, 
this apparently expresses the opinion of 
others that he deviated negatively or posi- 
tively from how he ought to behave. It 
should be clear, then, that the individual 
optimization problem of the model can be
reformulated equivalently in terms of the
standard c, deviant behavior b − c, normal
approval (1 − σ)wc, and sanctions w(b − c).
Thus, the model provides an understanding 
of norm-regulated behavior. 

It is often assumed that norms are inter- 
nalized. For instance, Talcott Parsons (1951 
pp. 38-40) argues that a full institutional- 
ization of a behavioral standard requires its 
general internalization, by which he means 
its incorporation into the agents’ superego. 
According to psychoanalytic personality the- 
ory, the superego is a major part of the 
psyche that represents a person’s behavioral 
ideal in agreement with the standards of 
society and evaluates the correctness of 
one’s behavior from the point of view of this 
ideal. What is called conscience is held to 
be manifested by part of the superego. It is 
supposed to develop in response to advice, 
warnings, and punishment by parents and 
other socialization agents. Therefore, inter- 
nalization of norms essentially means the 
development of an internal system of sanc- 
tions that is structurally equivalent to the 
external system and, by implication, the 
adoption of external standards of behavior 
as own standards. The internal system of 
sanctions operates in terms of self-approval, 
which, analogously to external approval, 
measures the “positivity” of welfare-rele- 
vant emotional reactions toward one’s own 
behavior. The possibility of internalization 
is important to the domain of the model. 
Insofar as norms are internalized, it may 
help to understand norm conformity even in 
situations when the respective agent is un-
observed.⁹ A functional equivalent to intro-
jection is the projection of the system of
sanctions on an omniscient and almighty
god. In the first case, the group manifests
itself as part of the individual, in the second
as part of God.


⁹Robert Trivers (1971 p. 50) argues that humans
could have evolved a conscience as a protection against
unfavorable consequences of being caught defecting.
Frank (1987) analyzes the problem of reliably signaling
a conscience.


IV. Alternative Modes of Allocation and Welfare 


A. Pareto-Suboptimality of the 
Equilibrium Allocation 


The approval constraint (6) can be inter- 
preted as a household production function 
describing feasible transformation opportu- 
nities of private goods into approval. As the 
average contribution, c, affects approval 
generation, all “production functions” are 
interdependent. First, and not surprisingly, 
status orientation induces a negative exter- 
nality. A second externality affects the ap- 
proval rate through c [compare (9)]. Its sign 
seems to be unclear. Furthermore, as aver- 
age contribution and supply of the collective 
good coincide, there is also a positive exter- 
nality from production to collective good 
consumption. In general, these three exter- 
nalities will render the equilibrium alloca- 
tion inefficient. In order to establish an ef- 
ficient symmetric allocation, it is obviously 
necessary and sufficient to maximize utility 
as stated in (7) with respect to c and w 
subject to the symmetry conditions b=c 
and v=w, as well as the approval rate 
condition (9). In order to realize a first-best 
allocation systematically, agents would have 
to be committed to an ethical principle 
obliging them to adopt the above program. 
For example, the “Kantian” rule obliging 
everybody to make the minimal cooperative 
contribution that he wishes all others to 
make¹⁰ is obviously sufficient, provided they
are well-informed about the structure of the 
model. However, self-interested rational 
agents apparently do not follow this rule. 
They end up in a social exchange equilib- 
rium that, as long as other modes of alloca- 
tion are ignored, is only second-best. 


¹⁰This rule has been called “Kantian” by Jean-
Jacques Laffont (1975) and Collard (1978). Others have
named it “rational commitment” (John C. Harsanyi,
1980) or “unconditional commitment” (Sugden, 1984).


B. Government and Market Allocation
of the Collective Good 


Consider now an otherwise identical 
group without emotional incentives so that 
approval is zero for everybody. Let the gov- 
ernment supply an optimal amount of the 
collective good financed by enforced sym- 
metric tax payments. The optimal supply of 
the collective good, c°, maximizes utility 
subject to b=c and zero approval with re- 
spect to c. The following necessary and 
sufficient condition characterizes the opti-
mum:¹¹


(11)  $u_3'(c^0) \le u_1'(\pi - c^0)$, and equality if $c^0 > 0$.


Of course, this is the Samuelson condi-
tion. If c° > 0, the individual willingness to
pay, $u_3'/u_1'$, is equal to 1 so that the aggre-
gate willingness is equal to n, the marginal
cost of the collective good. Thus, provided 
the group had invented some turnstile en- 
abling exclusion from consumption at negli- 
gible cost, the above allocation could also 
be realized as a market equilibrium. The 
price of one unit of the private good appar- 
ently equates supply and demand for the 
collective good at c°. 

One might perhaps expect that state or 
market provides more of the collective good 
than a system of social exchange. Surpris- 
ingly, this is not true a priori. Figure 2 
depicts the case c* > c° > 0.

As only the BC curve depends on ap-
proval preferences, one can apparently con-
clude that, provided σ < 1, sufficiently
strong desire for approval leads to c* > c°.
(Note that extreme status orientation in- 
deed precludes c* > c°.) In the light of the 
model, the empirical finding of Titmuss 
(1971) that a system of voluntary blood do- 
nations provides not less transfusion blood 
than a blood market is not paradoxical 
at all. 

With respect to group welfare, the opti- 
mal allocation attainable through govern- 
ment intervention or an ideal market is not 


¹¹Again because of $u_1'(0) = \infty$, the case $u_1' < u_3'$ at
$c^0 = \pi$ is impossible.


FIGURE 2. ALTERNATIVE ALLOCATIONS


superior to the Kantian allocation and is 
inferior in general. This is due to the fact 
that Kantian agents can contribute c° vol- 
untarily and thus additionally enjoy nonneg- 
ative approval. However, the following re- 
sult is remarkable. 


PROPOSITION 4: Consider an economy 
without emotional incentives, but otherwise 
identical. In the optimal allocation attainable 
through government intervention or an ideal 
market, group members are worse off than in 
the social exchange allocation if the latter 
provides more of the collective good than the 
former.¹²


¹²For proof’s sake, consider welfare in BC equilib-
ria. At (w°,c°) in Figure 2, social exchange welfare is
not less than welfare in a state or market allocation
because of nonnegative approval. To complete the
argument, I only have to show that the transition from
(w°,c°) to (w*,c*) along the BC curve increases wel-
fare. An increase in w with c remaining constant
increases approval (σ < 1!) and, thus, welfare. Now, a
marginal increase in the approval rate also increases
everybody’s contribution and, therefore, c. The in-
crease in contributions is welfare-relevant only through
the status and the collective good externality because,
owing to optimization, internalized utility effects sum
to zero. The aggregate value of a marginal increase in
both externalities, measured in terms of the private
good, is given by the ordinate value of the VW curve.
The latter is positive for all c between c° and c*,
which completes the proof.


A few remarks are in order. Modes of
allocation that rely on voluntary contribu- 
tions to the collective good have an advan- 
tage over other modes. As, from the social 
point of view, approval is a free by-product 
of the collective good, the group, compared 
to other modes, uniformly obtains more 
outputs from the same inputs, provided the 
latter are positive and status orientation is 
not extreme. The advantage in production 
explains why social exchange, although stuck 
with externality problems, may do better 
than an ideal market or a central planner. 
Even if the supply of the collective good in 
social equilibrium falls short of c°, this may 
be more than compensated by a positive 
emotional climate within the group. 
Whether or not such a compensation actu- 
ally takes place strongly depends on the 
degree of status orientation. A group that 
relies on social exchange for provision of 
collective goods should try to reduce σ by
educating its children 1) not to seek status 
and prestige and 2) to reward others by 
positive sentiments for their cooperative 
contributions, however small they may be, 
and whatever third agents may contribute. 


C. Norms and Expanding Market Domain 


Hirsch (1976 Ch. 6) has put forward the 
interesting hypothesis that empirically ob- 
servable commercialization of collective 
goods has the following two related effects. 
First, market exchange tends to generalize 
itself in the sense that it considerably weak- 
ens or even completely destroys social norms 
demanding cooperation for the provision of 
the respective collective goods. Second, this 
supersession of norms by markets may well 
decrease social welfare. However, Hirsch 
would seem to have no explicit model of the 
interdependence of norms and markets so 
that it remains unclear whether or not his 
argument is conclusive. In the following, I 
show that my model lends support to the 
commercialization effects as put forward by 
Hirsch. 

Suppose that a new turnstile permits set- 
ting up a market for the collective good and 
that it is supplied infinitely elastically at 
the marginal-cost price of one unit of the 
private consumption good. If c* > c°, the
willingness to pay, owing to $u_3'(c^*) < u_1'(\pi - c^*)$,
is smaller than its marginal-cost
price of 1 for any positive market demand, 
so that market trade is impossible. By con- 
trast, if c*<c°, every agent will demand 
c°—c* units of the collective good in the 
market, because its price is smaller than the 
initial willingness to pay. Thus, in the case 
c* < c°, whatever may be supplied voluntar-
ily of the collective good, its total equilib-
rium supply must be c° if social exchange is
combined with market exchange. The same 
is obviously true if the government inter- 
venes for the collective good. The following 
result is proved in the Appendix. 


PROPOSITION 5: Consider an economy
with c° > c* > 0. Then, the opening of a mar-
ket, as well as government intervention for the
collective good, reduces voluntary contribu-
tions, social approval if σ < 1, and possibly
also group welfare. 


This can be understood intuitively as fol- 
lows. In the new equilibrium, a larger part 
of the private good endowments is allocated 
to the collective good, so that the marginal 
utility of private good consumption is in- 
creased. This increase has two negative ef- 
fects. First, it induces people to reduce 
voluntary contributions for given approval 
incentives. Second, together with the de- 
crease in the marginal utility of the collec- 
tive good due to increased supply, it in fact 
brings down the approval rate, because an 
increase in collective good consumption is 
now less valuable in terms of the private 
good. Both effects cause people to behave 
more selfishly. Moreover, weakened ap- 
proval incentives and lower voluntary con- 
tributions mean less social approval. Finally, 
the welfare loss from the colder social cli- 
mate need not be compensated by the wel- 
fare gain from allocating more resources to 
the collective good. 


VI. Concluding Remarks 


As the Introduction already provided a 
summary, I conclude with a few remarks. 
The analysis presented obviously relies on a 
number of restrictive assumptions. Some of 
them have been employed merely to keep
the model as simple as possible. At the
expense of rapidly growing complexity, it 
can be generalized straightforwardly to 
cover many collective goods and various 
kinds of asymmetries, for instance. At least 
two assumptions, namely, the assumed abil- 
ity to react emotionally to behavior of oth- 
ers and the assumed dependence on emo- 
tions and attitudes of others toward oneself, 
are fundamental and indispensable. Al- 
though emotional reactivity and emotional 
dependence hardly can be doubted empiri- 
cally, a complete theory of cooperation 
should also explain how the genes for reac- 
tivity and dependence might have spread 
over the human gene pool. This problem is 
far from trivial. Emotional reactions will 
require some energy, and emotional depen- 
dence renders the respective agents ex- 
ploitable by others who are not dependent, 
so that carriers of either type of gene suffer 
from an uncompensated disadvantage in re- 
productive fitness as long as everybody ben- 
efits in like manner from the collective good. 
Thus, in order to explain reactivity and de- 
pendence as evolutionary-stable phenom- 
ena, one has to give reasons for a compara- 
tively higher collective good consumption of 
reactive and dependent individuals. Pre- 
sumably, this will be possible only in a small 
group setting that actually has been relevant 
for the evolution of human traits because, 
compared to the history of evolution, the 
history of large social formations is negligi- 
bly short. Even nowadays, the bulk of volun- 
tary cooperation occurs in small social units 
such as families, friendships, neighbor-
hoods, and groups of workmates.¹³ Trivers
(1971 pp. 48-9) has argued that, within a 
small group setting, gratitude, moral aggres- 
sion, and sympathy have been selected for 
in support of reciprocal altruism, as cooper- 
ation is called in biological terminology, but 


Sit is quite obvious that the main results of the 
model are valid also for the small-group case where the 
affectability of collective good supply only provides an 
additional incentive for cooperation. Apart from theo- 
retical simplicity, the main reason for choosing a large 
group setting was to show that even in the most diffi- 
cult case the approval incentive alone may be sufficient 
to support significant cooperation. 
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his reasoning would. not seem to be fully 
convincing, Lack of space makes it impossi- 
ble to pursue this matter any further here, 
but work of. Axelrod (1984) and Frank (1987) 
on evolutionary problems structurally closely 
related to ours suggests that a solution is 
possible. 


APPENDIX 
Proof of Proposition 5 


Let c° denote total equilibrium supply of the collec-
tive good (as well as total individual contributions), c̃
denote voluntary equilibrium supply (as well as volun-
tary individual contributions), and w̃ denote the equi-
librium approval rate. The conditions for market equi-
librium [compare (11)], approval rate sustainability
[compare (10)], and optimization [compare (8)] are


(A1)  $u_1'(\pi - c^0) = u_3'(c^0)$

(A2)  $\tilde{w} = \begin{cases} \dfrac{u_3'(c^0)}{u_1'(\pi - c^0) + \sigma u_2'(0)} & \text{if } \tilde{c} = 0 \\[2ex] \dfrac{u_3'(c^0)}{u_1'(\pi - c^0)} - \sigma & \text{if } \tilde{c} > 0 \end{cases}$

(A3)  $u_1'(\pi - c^0) \ge \tilde{w}\, u_2'[(1 - \sigma)\tilde{w}\tilde{c}]$, and equality if $\tilde{c} > 0$.


From (10) and (A2), one can conclude w̃ < w* owing to
c° > c*. Next, I want to show c̃ < c*. This is trivial if
c̃ = 0. For c̃ > 0, (A3) holds with equality. The corre-
sponding condition for c* is


(A4)  $u_1'(\pi - c^*) = w^* u_2'[(1 - \sigma) w^* c^*]$.


Now, switching from (A4) to (A3), the increase from
c* to c° increases the left-hand side, and the decrease
from w* to w̃ decreases the right-hand side. In order
to compensate for these effects, c̃ must be smaller than
c*. Thus, voluntary contributions are lower in the new
equilibrium. Owing to w̃ < w* and c̃ < c*, social ap-
proval is reduced by the amount (1 − σ)(w*c* − w̃c̃).
Finally, the welfare loss from reduced approval evi-
dently need not be compensated by the welfare gain
from allocating more private resources to the collective
good.


The Economic Effects of Production Taxes in a Stochastic
Growth Model


By MICHAEL DOTSEY*


This paper analyzes the effects of stochastic taxes on production in a simple 
stochastic growth model. In so doing, the paper explicitly examines the impor-
tance of uncertainty on the decisions of individual agents. This importance is
emphasized by comparing economic outcomes to various realizations of tax rates 
under uncertainty and perfect foresight. (JEL 321) 


The Economic Recovery Tax Act of 1981 
led to the largest postwar decline in effec-
tive tax rates on capital. The legislation also
had its most significant effect on rates in 
1982 because of the rapid decline in infla- 
tion. Although some of the tax cut was 
rescinded in 1982, effective corporate tax 
rates on plant and equipment, measured as 
the difference between before- and after-tax 
rates of return to capital as a percentage of 
before-tax rates of return, remained at his- 
torically low values through 1986. Accompa- 
nying this tax cut is the current economic 
recovery, which began in November 1982. 
During this recovery we have witnessed rel- 
atively large increases in business fixed in- 
vestment, a stock market boom, and a large 
rise in both the ex post and ex ante real 
interest rate. It is, therefore, natural to in-
vestigate the linkages between the tax cut
and the increase in economic activity. 

The effects of this particular tax cut have 
been consistent with a pattern of negative 
correlations between taxes and real interest 
rates, stock prices, investment, and output 
growth observed following previous business 


tax cuts. For example, using annual data on


*Federal Reserve Bank of Richmond, 701 E. Byrd
James Hamilton, Tony Kuprianov, Ching Sheng Mao,
the course of this research. Bob LaRoche provided
excellent research assistance. The views expressed in 
this paper are solely those of the author and do not 
necessarily reflect the views of the Federal Reserve 
Bank of Richmond or the Federal Reserve System. 




the effective tax rates on plant and equip- 
ment reported in Charles R. Hulten and 
James W. Robertson (1982), the correlation 
coefficients between the logarithm of tax 
rates and the logarithms of real GNP, real 
business fixed investment, and the New York 
Stock Exchange price index for the period 
1952-84 are −0.56, −0.55, and −0.65, respectively.
Further, using an autoregression to calcu- 
late expected inflation over the period 
1960-84, the logarithm of one plus the 
ex ante after-tax real rate of interest and the 
logarithm of the effective tax rate display a 
correlation coefficient of −0.38, while the
coefficient with respect to the logarithm of
one plus the ex post real rate is −0.50.¹
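The kind of calculation reported above, a correlation between the logarithm of the tax rate and the logarithm (or log first difference) of a real variable, can be sketched as follows. The function names are hypothetical and the numbers are made up purely for illustration; they are not the Hulten and Robertson data. Pairing the tax-rate level with the growth rate of the series in the first-difference case is also an assumption about how the reported figures were constructed.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def log_correlation(tax_rates, series, first_difference=False):
    """Correlate log tax rates with the log (or log first difference) of a series."""
    log_tax = [math.log(t) for t in tax_rates]
    log_series = [math.log(s) for s in series]
    if first_difference:
        # Differencing loses one observation, so drop the first tax observation too.
        log_series = [b - a for a, b in zip(log_series, log_series[1:])]
        log_tax = log_tax[1:]
    return pearson(log_tax, log_series)


# Made-up numbers for illustration only.
tax = [0.40, 0.35, 0.33, 0.25, 0.10]
output = [100.0, 104.0, 103.0, 110.0, 118.0]
print(round(log_correlation(tax, output), 2))
print(round(log_correlation(tax, output, first_difference=True), 2))
```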

While the general effects of the recent tax 
cut are consistent with effects observed in 
other periods, the relative size of the 1981 
tax cut is quite large. For instance, Hulten 
and Robertson calculate that the effective 
tax rate on capital in nonresidential busi- 
ness was reduced from roughly 33 percent 
in 1980 to approximately 1 percent in 1984. 
It is not surprising that a change of this 
magnitude has generated renewed interest 
in the interaction between taxes on capital 
and real economic variables. 

This paper examines the substitution ef- 
fects of tax rate changes within the context 
of a stochastic growth model. Because taxes 
drive a wedge between the marginal prod-
uct of capital and its after-tax return, the


¹When first differences of the logarithm of real
GNP and real business fixed investment are used, the
correlation coefficients are −0.38 and −0.06.


resulting equilibrium is suboptimal. Further,
the effects of taxes are highlighted by the 
lump-sum remittance of tax proceeds to in- 
dividuals. The methodology for finding the 
solutions to an individual’s dynamic pro- 
gramming problem relies on the envelope 
theorem and the construction of a policy 
function that simultaneously solves the indi- 
vidual’s optimization problem and market 
clearing conditions. Attacking the problem 
in this way avoids explicit consideration of 
the value function and produces a tractable 
method of analysis for problems in which 
aggregate conditions affect an individual’s 
value function. Essentially, the problem is 
reduced to solving one functional equation 
for policy. 

With the exception of David Bizer and 
Kenneth L. Judd (1988), most of the work 
dealing with the effects of taxes has pro- 
ceeded within the confines of standard non- 
stochastic growth models (e.g., William A. 
Brock and Stephen J. Turnovsky, 1981; 
Andrew B. Abel and Oliver Blanchard, 1982) 
or in models in which agents have perfect 
foresight regarding the path of tax rates 
(e.g., Robert A. Becker, 1985; Lawrence H. 
Goulder and Lawrence H. Summers, 1987). 
Little effort seems to have been given to
examining the effects of tax rate changes 
when taxes are explicitly depicted as follow- 
ing a particular stochastic process. 

This paper takes the latter approach and 
investigates the effects of tax changes in a 
stochastic growth model in which only tastes, 
technology, and the stochastic process for 
taxes are exogenous. This procedure takes 
seriously the methodology advocated by 
Robert E. Lucas, Jr. (1976) and Thomas J. 
Sargent (1979) that a policy should be rep- 
resented as a given outcome of some 
stochastic process. The analysis yields in-
vestment, output, and real interest rate be-
havior that depend explicitly on tastes and 
technology parameters as well as the under- 
lying process generating tax rates. 

The qualitative movements in endoge- 
nous variables that are generated by tax 
rate changes in this model are similar in 
some instances to results derived in the 
nonstochastic or perfect foresight models. 
For instance, when a high tax rate is gener-
ated (and the rate is expected to remain
high), agents reduce the capital stock. This 
leads to lower real interest rates and lower 
transitional real output growth. The qualita- 
tive similarity of the results from the various 
models is a natural outcome of the behavior 
of agents, behavior that is characterized by 
analogous intertemporal optimization prob- 
lems in all three classes of models. 

An explicit stochastic treatment of taxes 
allows one to examine the effects of uncer- 
tainty on the decisions of individual agents. 
This uncertainty should be an important 
consideration in determining behavior, since 
tax law changes are fairly frequent and 
changes in inflation do result in significant 
movements in the effective tax rate on capi- 
tal. Also, the exact nature of the uncertainty 
is related to the specific process that taxes 
are assumed to follow. For example, the 
degree of persistence of the process gener- 
ating tax rates will have important implica- 
tions for individual behavior. Therefore, 
agents will show quantitatively different be- 
havior for any specific realizations of tax 
rates when the inherent randomness of taxes 
is modeled as opposed to treating tax rate 
data as being known with certainty. If one 
wants to derive realistic decision rules, then 
the stochastic nature of the agent’s problem
needs to be analyzed explicitly. 

The results generated by the stochastic 
growth model derived in this paper produce 
correlations that are consistent with those 
mentioned above. However, the simple 
model examined below is not overly success- 
ful at replicating correlation coefficients at 
various leads and lags, especially when the 
first differences of logs are used. A more 
detailed investigation that tests to see if 
incorporating tax policy into a more sophis-
ticated model improves that model’s ability
to replicate actual time series is needed 
before one can determine the importance of 
taxes on economic activity.

The paper is structured as follows. Sec- 
tion I contains a general description of the 
model, while Section II examines a particu- 
lar solution for the case of log utility and 
exponential production. Of particular inter- 
est is the derivation of a closed-form solu- 
tion to the nonlinear stochastic difference
equations that determine the equilibrium.
In this section, the importance of the
stochastic process generating taxes is em-
phasized by examining the cases where taxes
follow either a two-state Markov process or 
are independently and identically distrib- 
uted. In Section III, the model is extended 
to examine investment tax credits. Since do- 
ing so does not substantially affect the re- 
sults, the treatment is fairly concise. A com- 
parison between capital stocks generated by 
the model for particular realizations of a
stochastic process under uncertainty with 
those generated by a perfect foresight model 
is made in Section IV, while a short sum- 
mary is given in Section V. 


I. The Model 


The model is a one-sector stochastic 
growth model consisting of three economic 
entities: firms, consumers, and the govern- 
ment. Individuals are infinitely lived and 
maximize the discounted stream of momen- 
tary utility while firms produce output ac- 
cording to a concave production function, 
f(k), where output is total output that is 
available for production and consumption.²
The various economic entities interact in 
two markets each period. First, there is a
capital market in which firms purchase capi- 
tal from individuals. Capital is carried over 
from the previous period and is therefore 
supplied inelastically as in Brock (1979). 
Next there is a combined goods and securi- 
ties market in which individuals allocate 
their wealth among goods and securities. 
Individuals also decide how much they will 
consume and how much capital to carry into 
the succeeding period. The government 
taxes away some of the firm’s revenue and 
remits the proceeds lump sum to individu- 
als. Tax rates are stochastic and are an- 
nounced at the beginning of each period so 
that there is no uncertainty over the current 
tax rate, but there is uncertainty over future 
tax rates. 


²One could easily think of f(k) as equaling total
production g(k) minus depreciation D(k).


A. Capital Market


At the beginning of period $t$, the representative
individual has carried over $k_t$ units
of capital that are sold to the firm for $p_t$
units of output per unit of capital. One can
think of output as seeds that can either be 
eaten (consumed) or invested (carried into 
the next period and sold to firms). In this 
respect the model is similar to Becker (1985) 
and Jean Pierre Danthine and John B. 
Donaldson (1985). The firm maximizes its 
after-tax profits 


(1) $d_t = (1-\tau_t) f(k_t) - p_t k_t$


where $\tau_t$ is the effective tax rate on capital.
The effective tax rate on capital is basically 
an index number that aggregates the effects 
of various components of the tax code (for 
example, the legislated tax rate, deprecia- 
tion allowances, and the effects of inflation). 
It is essentially an attempt to calculate a 
number that is sufficient for determining 
individual behavior in response to various 
changes in taxation (for a detailed descrip- 
tion of its calculation, see Hulten and 
Robertson [1982]).³

This optimization implies that the price 
of capital is equated to its after-tax marginal 
product; therefore, the price of capital and 
profits are determined as functions of the
capital stock and the tax rate. Formally,
$p_t = (1-\tau_t) f'(k_t)$. This formulation of tax
on output is chosen for analytical tractabil- 
ity and captures the tax wedge resulting 
from a given effective tax rate on capital. 
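
As a worked illustration of the firm's first-order condition, the block below differentiates (1) with respect to $k_t$ and then specializes to the power technology $f(k) = k^{\alpha}$ that is introduced in Section II. The specialization is an assumption made here for illustration only; the general condition does not depend on it.

```latex
\begin{align*}
\frac{\partial d_t}{\partial k_t} &= (1-\tau_t) f'(k_t) - p_t = 0
  \quad\Longrightarrow\quad p_t = (1-\tau_t) f'(k_t), \\
\text{with } f(k) = k^{\alpha}: \quad
p_t &= \alpha (1-\tau_t) k_t^{\alpha-1}, \\
d_t &= (1-\tau_t) k_t^{\alpha} - \alpha (1-\tau_t) k_t^{\alpha}
     = (1-\alpha)(1-\tau_t) k_t^{\alpha}.
\end{align*}
```

Under this functional form, profits are a fixed share $(1-\alpha)$ of after-tax output.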


B. Goods and Securities Market 


³As formulated, equation (1) makes $\tau$ appear as a
production tax rather than an income tax. As noted in
the text, the tax rate that is being used is an effective
tax rate, not a legislated rate. Also, even if $\tau$ were just
the legislated profits tax, since legislated depreciation
allowances are not based on true economic depreciation,
the current tax code involves aspects of a wealth tax.

After selling capital to firms and receiving
the distribution of profits and lump-sum tax
remissions, individuals choose their current
consumption, $c_t$, next period's holdings of
capital, $k_{t+1}$, and their share of the firm,
$s_{t+1}$, subject to the value of their current
wealth, $w_t$, where


(2) $w_t = (q_t + d_t) s_t + p_t k_t + T_t$.


The first term after the equality represents 
the current value of the shares of the firms,
$q_t s_t$, plus dividend payments $d_t s_t$. The second
term is the payment for capital, and the
third term is the per capita lump-sum transfer
of tax proceeds. Since all firms have
access to the same strictly concave production
technology, f(k), each firm will utilize
the same amount of capital, k. In general,
for $m$ firms and $n$ individuals (where $m$ and
$n$ are large), $T_t = (m/n)\tau_t f(k_t)$. For simplicity
of exposition, let $m = n$ and hence
$T_t = \tau_t f(K_t)$, where $K_t$ is the aggregate per
capita capital stock. The budget constraint
facing the individual is 


(3) $c_t + k_{t+1} + q_t s_{t+1} \le w_t$.


C. The Individual's Maximization


The individual’s problem is to maximize 
his discounted expected utility 


(4) $U = E_t \sum_{j=t}^{\infty} \beta^{j-t} u(c_j)$


subject to his budget constraint (3). 

The solution to the competitive equilib- 
rium for the economy described above is 
equivalent to the solution of a planner’s 
problem in which the planner faces a 
stochastic discount rate. In particular, the 
competitive equilibrium is equivalent to the 
problem where the planner maximizes 


(4') $U = E_t \sum_{j \ge t} b_j u(c_j)$

subject to a sequence of constraints

(3') $c_t + k_{t+1} \le f(k_t)$

where $b_j = \beta^{j-t} \prod_{i=t+1}^{j} (1-\tau_i)$. This planning
problem is the stochastic analogue of
the one presented in Becker (1985). The 
solution involves the maximization of a con- 
cave function over a convex set and is, 
therefore, unique. Generally, it is difficult to 
find an equivalence between a planner’s 
problem and a competitive equilibrium in- 
volving a tax structure that is more involved 
than the one analyzed here. For that rea- 
son, the paper proceeds by examining the 
solution to the competitive equilibrium di- 
rectly. The problem is first posed in terms 
of dynamic programming and is formally 
stated as 


(5) $V(k,K,s,\tau) = \max_{c,k',s'} \{ u(c) + \beta E V(k',K',s',\tau') \}$

such that

$c + k' + q s' \le w$

and $w = (q + d)s + pk + T$


where per capita capital is determined by
$K' = \Psi(K,\tau)$. The price of capital, $p(K,\tau)$,
and profits, $d(K,\tau)$, are given by the solution
to the firm's problem. Tax rebates are
given by $T_t = \tau_t f(K_t)$, and asset prices are
determined by the function $q(K,\tau)$. In solving
their optimization problem, individuals
take as given $q(K,\tau)$, $d(K,\tau)$, and $p(K,\tau)$,
as well as the policy function that describes
the transition path of the aggregate per
capita capital stock, $K' = \Psi(K,\tau)$. Further,
the notation $E$ is used to denote the
conditional-expectations operator where expectations
are conditioned on all current
information, including the realization of this
period's tax rate.

In order to guarantee a unique solution 
to the value function, one must place fairly 
severe restrictions on taste and technology. 
Wilbur John Coleman II (1989) presents a
detailed analysis of existence and uniqueness
problems in a setting similar to the one
examined here.⁴ These restrictions are not
met by the examples considered in Section 
II. However, one can verify that the solu- 
tions presented below satisfy both the Euler 
equations and the transversality conditions 
of the planner’s problem and are, therefore, 
the correct solutions. 

The first-order conditions for the con- 
sumer’s problem are given by 


(6a) $Du(c) = \lambda$
(6b) $\beta E D_1 V(k',K',s',\tau') = \lambda$
(6c) $\beta E D_3 V(k',K',s',\tau') = \lambda q$


and the budget constraint (3). The notation 
$D_i$ represents the partial-derivative operator
with respect to a function's ith argument,
and $\lambda$ is the Lagrange multiplier associated
with the individual’s budget constraint. 
Equation (6a) implies that the marginal 
utility of consumption equals the marginal 
utility of wealth. That is, individuals are 
indifferent between consuming or holding 
an extra unit of wealth. Equation (6b) states 
that the discounted value of next period’s 
marginal utility of wealth times the after-tax 
marginal productivity of capital equals the 
marginal utility of wealth. This means that 
at an optimum the individual is indifferent 
between investing in an extra unit of capital 
and consuming less today. Equation (6c) is 
the difference equation determining the 
price of equity. It implies an indifference at 
the margin between selling equity today and holding
the equity and selling it next period.


D. Equilibrium 


⁴The necessary conditions for the existence of a
unique V satisfying (5) are (a) momentary utility must
be a strictly concave, twice continuously differentiable
function that is bounded from below with the property
that $u'(0) = \infty$; (b) the production function is also strictly
concave and twice continuously differentiable with $f(0) = 0$,
$1/\beta < b < f'(0) < B < \infty$, and there exists a finite k
such that f(k) < k; and (c) finally, $(1-\tau) f'(k)$ must be
strictly decreasing in k.

The solution to the competitive equilibrium
involves finding the individual's policy
function $k' = \psi(k,K,s,\tau)$ that simultaneously
satisfies his first-order conditions, is
consistent with market clearing in the goods
and asset markets, and yields individual
holdings of capital equal to the aggregate
per capita capital stock. Formally, the
equilibrium conditions are


(7a) $s = 1$
(7b) $c + k' = f(k)$
(7c) $k = K$.


Because individuals are too small to affect 
the aggregate capital stock and taxes are 
distortionary, the solution to the competi- 
tive equilibrium is suboptimal. 

Solving for the competitive equilibrium
involves making use of the envelope theorem
to obtain $D_1 V(k,K,s,\tau) = \lambda p$ and
$D_3 V(k,K,s,\tau) = \lambda(q + d)$. Thus, (6b)
can be rewritten as

(8a) $\beta E\, Du(c')\, p' = Du(c)$.

Employing the equilibrium conditions
(7a)-(7c) and $\psi(k,k,1,\tau) = \Psi(k,\tau) = h(k,\tau)$,
the solution to the problem involves
finding a function $h$ that satisfies

(8b) $\beta E\, Du[f(h(k,\tau)) - h(h(k,\tau),\tau')]\,(1-\tau')\, f'(h(k,\tau)) = Du[f(k) - h(k,\tau)]$.

Finally, the solution for equity prices is given
by

(8c) $\beta E[\lambda'(q' + d')] = \lambda q$.


In general, proving the existence of the 
policy function $h(k,\tau)$ requires the same
conditions as those already needed to guarantee
the existence of the individual's value
function, while uniqueness requires a few
additional restrictions (see Coleman, 1989).
Since the problem under consideration is
equivalent to a planning problem, one need
only check that a policy function satisfying
(8b) also solves the planner's problem. Upon
finding $h(k,\tau)$, it is then possible to construct
the equilibrium stochastic process for
capital, consumption, the price of capital,
and security prices.


II. An Example
A. A Particular Solution 


Since the main concern of the analysis is 
to examine how various realizations of tax 
rates affect economic activity, the remainder 
of the paper will examine a particular func- 
tional form for which a closed-form solution 
exists. In particular, the utility function
$u(c) = \ln(c)$ and the production function
$f(k) = k^{\alpha}$, $0 < \alpha < 1$, will be emphasized.⁵ The
results of this section will concentrate on the
effects of production taxes, but a brief
investigation of investment tax credits will be
undertaken in Section III. The closed-form 
solutions to these problems are nontrivial 
and involve the solution to a set of nonlin- 
ear difference equations. 

Without the remittance of taxes (i.e., if
tax proceeds were destroyed), this problem
would be equivalent to the one solved by
Brock (1979), where the production function
was subject to multiplicative productivity
shocks. In that case, the fraction of output
allocated to investment would equal $\alpha\beta$
and be independent of the stochastic process
followed by taxes. The assumption that
all tax proceeds are remitted lump sum is,
to some extent, like assuming that government
spending is valued exactly like consumption.
This allows one to ignore the
effects of government spending and to isolate
the effect of taxes. Also, the remittance
of these proceeds implies that it is the
compensated effects of consumption that are
being analyzed and that even in the presence
of log utility, expectations of future
taxes will be an important determinant of
current decisions.

⁵Because results using log utility sometimes represent
a special case, simulations were run using
constant-relative-risk-aversion utility functions with
risk-aversion parameters of 1/2 and 2. Also, a production
technology incorporating less than full depreciation,
$f(k) = k^{\alpha} + (1-\epsilon)k$, was analyzed. None of the results
involving quantities was qualitatively affected either
with respect to correlation coefficients or the comparison
of capital paths under uncertainty and perfect
foresight. I am indebted to Ching Sheng Mao for
running simulations using his real-business-cycle model.

DOTSEY: PRODUCTION TAXES IN A STOCHASTIC GROWTH MODEL 1173


The procedure used to calculate the com- 
petitive equilibrium is to find a policy func- 
tion that satisfies the particular form of the 
difference equation given in (8b) that is 
generated in this example. Alternatively, one 
could try to find the value function, but this 
latter procedure does not appear promising. 
In problems involving suboptimal equilibria 
where aggregate state variables appear in
the individual’s value function, finding a 
policy function seems to be a precondition 
to calculating the value function. 

With taxes either following a first-order 
Markov process or distributed indepen- 
dently and identically, an intuitive guess re- 
garding the decision rules governing con- 
sumption and investment is that each is a 
fraction of output and that these fractions 
are potentially functions of the current real- 
ization of the tax rate (past realizations 
would be important for Markov processes of 
higher order). That decision rules follow a 
linear process can be directly derived using 
the symmetry equilibria procedures in John 
H. Boyd III (1986).⁶ In particular, $k' = \gamma(\tau) f(k)$
and $c = (1 - \gamma(\tau)) f(k)$, where

(9) $\gamma(\tau) = \dfrac{\alpha\beta g(1-\tau)}{1 + \alpha\beta g(1-\tau)}$

and $g(1-\tau)$ is given by the recursive relationship

(10) $g(1-\tau) = E\{(1-\tau')[1 + \alpha\beta g(1-\tau')]\}$.


⁶The symmetry can be written as $S(p_t, q_t, c_t, s_t, k_t, w_t, T_t)
= (\lambda_{t+1}\lambda_t^{-1} p_t, \lambda_{t+1} q_t, \lambda_{t+1} c_t, s_t, \lambda_t k_t, \lambda_{t+1} w_t, \lambda_{t+1} T_t)$,
where $\lambda_{t+1} = \lambda_t^{\alpha}$. This symmetry maps
initial capital $k$ into $\lambda k$, preserves household budget
constraints, the firm's objective, the government's budget,
the definition of profits, and market clearing. It
also implies that the choice of capital and consumption
is linear in income and depends on the tax process. I
am indebted to John Boyd for describing the methodology
to me.

The function $g(1-\tau)$ is unique since it is
the fixed point of a contraction operator
implicitly defined by (10).⁷ That this is a
solution to economy-wide equilibrium is
shown in the Appendix.
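
The contraction property suggests a simple way to compute $g$ numerically. The following is a minimal sketch, not the author's procedure: it iterates the recursion (10) on a finite-state Markov chain and then applies (9). The function names `solve_g` and `investment_share` are hypothetical, and the parameter values are the ones quoted for Table 1.

```python
def solve_g(tau, P, alpha, beta, tol=1e-12, max_iter=10_000):
    """Iterate g[i] = sum_j P[i][j] * (1 - tau[j]) * (1 + alpha*beta*g[j]),
    which is recursion (10) written for a finite-state Markov chain."""
    n = len(tau)
    g = [0.0] * n
    for _ in range(max_iter):
        g_new = [
            sum(P[i][j] * (1.0 - tau[j]) * (1.0 + alpha * beta * g[j])
                for j in range(n))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(g_new, g)) < tol:
            return g_new
        g = g_new
    raise RuntimeError("fixed-point iteration did not converge")


def investment_share(g_i, alpha, beta):
    """Equation (9): share of output invested in a state with value g_i."""
    x = alpha * beta * g_i
    return x / (1.0 + x)


# Illustrative parameters (assumed here; they match those quoted for Table 1).
alpha, beta = 1.0 / 3.0, 0.97
tau = [0.25, 0.50]              # state 0 = low tax, state 1 = high tax
P = [[0.9, 0.1], [0.1, 0.9]]    # P[i][j] = prob(next state j | current state i)

g = solve_g(tau, P, alpha, beta)
gamma = [investment_share(g_i, alpha, beta) for g_i in g]
print("g     =", g)
print("gamma =", gamma)
```

Because the operator shrinks differences by a factor of at most $\alpha\beta \max_j (1-\tau_j) < 1$, the iteration converges from any starting guess; starting from zero is only a convenience.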

The closed-form solutions for c and k’ 
indicate that the mixture between consump- 
tion and investment is based on the condi- 
tional expectation of the entire path of fu- 
ture taxes. The expected path of future taxes 
is relevant since it affects the value of future 
capital and the amount of consumption that 
can be purchased from the sale of capital to
firms. 


B. The Solution When Taxes Are i.i.d. 
(Independently and Identically Distributed)


Further intuition regarding the economic 
effects of tax rate changes can be gained by 
looking at the results obtained when taxes 
follow a particular stochastic process. To 
highlight the difference between permanent 
and transitory tax rate movements, both an 
i.i.d. and a simple two-state first-order
Markov process will be used. The behaviors
of investment, equity prices, and the real 
rate of interest differ quite markedly under 
the two assumed distributions. These exam- 
ples, therefore, clearly illustrate the impor- 
tance of correctly specifying the stochastic 
process for taxes if one is to have confi- 
dence in the derived consequences of tax 
rate changes. 

When taxes are independently and identically
distributed, with mean $\bar\tau$, certainty
equivalence obtains, and
$g(1-\tau_t) = (1-\bar\tau)/[1 - \alpha\beta(1-\bar\tau)]$, implying that
$\gamma(\tau_t) = \alpha\beta(1-\bar\tau)$. Hence the fraction of
output devoted to investment is independent
of the current realization of taxes, but
as in the deterministic tax literature, investment
will be lower for higher average tax
rates. Further, the time paths of consumption
and the capital stock are independent
of tax realizations as long as tax proceeds
are rebated.
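
The i.i.d. result follows in two lines from (9) and (10). The worked steps below are a sketch of that algebra; the numerical illustration at the end uses $\alpha = 1/3$, $\beta = 0.97$, and $\bar\tau = 0.375$, which are assumed values chosen to match parameters used elsewhere in the paper.

```latex
\begin{align*}
g &= E\{(1-\tau')(1+\alpha\beta g)\} = (1-\bar\tau)(1+\alpha\beta g)
  \quad\Longrightarrow\quad
  g = \frac{1-\bar\tau}{1-\alpha\beta(1-\bar\tau)}, \\
\gamma &= \frac{\alpha\beta g}{1+\alpha\beta g}
  = \frac{\alpha\beta(1-\bar\tau)}
         {1-\alpha\beta(1-\bar\tau)+\alpha\beta(1-\bar\tau)}
  = \alpha\beta(1-\bar\tau), \\
\text{e.g. } \gamma &= \tfrac{1}{3}(0.97)(0.625) \approx 0.202 .
\end{align*}
```

The first line uses the fact that $g$ is a constant when $\tau'$ is independent of the current state.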

⁷This last statement was pointed out by an anonymous
referee.

Regarding security prices, equation (8c)
represents a first-order difference equation
that, along with (1), the fact that $u'(c) = \lambda$,
and the solutions for $\gamma(\tau)$ and $g(1-\tau)$,
yields⁸

(11) $q_t = c_t E_t \sum_{j=0}^{\infty} \beta^{j+1} \dfrac{d_{t+j+1}}{c_{t+j+1}}
= \beta(1-\alpha)\, c_t E_t \sum_{j=0}^{\infty} \beta^{j} g(1-\tau_{t+j})$.

Security prices are observed to equal the
stream of after-tax profits discounted by the
risk-adjusted real rate of interest. As in
Brock (1979), the asset price represents a
return to the technology $k^{\alpha}$ and is directly
related to profits' share of output, $(1-\alpha)$, as
well as to the discount rate $\beta$.

Using the solution for $g(1-\tau)$ implies
that

(12) $q_t = \dfrac{\beta(1-\alpha)(1-\bar\tau)}{1-\beta}\, y_t$

and that equity prices are invariant to tax
rate realizations. Similarly, it can be shown
that the after-tax real rate of interest is
invariant to realizations of the tax rate.


C. The Solution When Tax Rates 
Are Persistent 


⁸The second equality in (11) is derived using (1),
(6b), and the envelope theorem. Using the equation for
profits (1) and the consumption function,

$\dfrac{d_{t+1}}{c_{t+1}} = \dfrac{(1-\alpha)(1-\tau_{t+1})}{1-\gamma(\tau_{t+1})}$.

Equation (8b) leads to the difference equation

$\alpha\beta E_t\left[\dfrac{1-\tau_{t+1}}{1-\gamma(\tau_{t+1})}\right] = \dfrac{\gamma(\tau_t)}{1-\gamma(\tau_t)}$.

From the definition of $g(1-\tau_t)$, the latter expression is
also equal to $\alpha\beta g(1-\tau_t)$.

When tax rates are no longer independently
and identically distributed, the property
of certainty equivalence no longer holds.
However, with taxes following a Markov
process, the recursive nature of $g(1-\tau)$ can
be used to derive closed-form solutions. For
example, consider a first-order Markov process
with transition probabilities given by

(13a) $\mathrm{prob}(\tau_{t+1} = \tau_0 \mid \tau_t = \tau_0) = \pi_0$
(13b) $\mathrm{prob}(\tau_{t+1} = \tau_1 \mid \tau_t = \tau_1) = \pi_1$


where $0 < \tau_0 < \tau_1 < 1$. Given the discussions
in Robert J. Barro (1979) and Robert E.
Lucas and Nancy L. Stokey (1983) concerning
the use of government debt to smooth
tax distortions over time, one would expect
the tax rate to show a good deal of persistence,
with both $\pi_0$ and $\pi_1$ being greater
than 1/2 and perhaps close to one. Further,
the qualitative results generated by using
the simple process described in (13a) and
(13b) are not altered by using a Markov
process having more than two states. Therefore,
the qualitative results yielded by this
example are of general interest. Taking advantage
of the recursive nature of $g(1-\tau_t)$
yields


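(14) $g(1-\tau_i) = \left[(1-\tau_i)\pi_i + (1-\tau_j)(1-\pi_i) - \alpha\beta(1-\tau_i)(1-\tau_j)\delta\right]/\Delta$

where

$\Delta = 1 - \alpha\beta\left[(1-\tau_i)\pi_i + (1-\tau_j)\pi_j\right] + \alpha^2\beta^2(1-\tau_i)(1-\tau_j)\delta$

for $i, j = 0, 1$, $i \ne j$, and $\delta = \pi_i + \pi_j - 1$.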

i. Consumption and Investment. The behavior
of consumption and investment will
now be strikingly different from that in the i.i.d.
case. For the Markov process under consideration,


(15) $g(1-\tau_0) - g(1-\tau_1) = \dfrac{1}{\Delta}\left[(\tau_1 - \tau_0)(\pi_0 + \pi_1 - 1)\right]$.

Equation (15) implies that $g(1-\tau_0) > g(1-\tau_1)$
if $\pi_0 + \pi_1 > 1$, that is, if tax rates are
likely to persist over time. From the definition
of $\gamma(\tau)$, this means $\gamma(\tau_0) > \gamma(\tau_1)$.
Hence, a greater fraction of output will be
invested in the low tax state if the low tax
state implies that taxes are more likely to be
low in the future. Also, the difference between
$\gamma(\tau_0)$ and $\gamma(\tau_1)$ will be directly related
to the degree of persistence, a result
that is analogous to behavior reported in
Bizer and Judd (1988).
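
The closed form can be checked against the recursion it is meant to solve. The sketch below evaluates (14) for both states, confirms that the results satisfy (10) for the two-state chain, and verifies the difference in (15). The function names are hypothetical and the parameter values are those quoted for Table 1; this is an illustrative check, not part of the original analysis.

```python
def g_closed_form(tau_i, tau_j, pi_i, pi_j, alpha, beta):
    """Equation (14) for state i (j is the other state). Returns (g_i, Delta)."""
    x = alpha * beta
    a_i, a_j = 1.0 - tau_i, 1.0 - tau_j
    delta = pi_i + pi_j - 1.0
    Delta = 1.0 - x * (a_i * pi_i + a_j * pi_j) + x * x * a_i * a_j * delta
    g_i = (a_i * pi_i + a_j * (1.0 - pi_i) - x * a_i * a_j * delta) / Delta
    return g_i, Delta


def share(g_i, alpha, beta):
    """Equation (9): investment share of output."""
    x = alpha * beta * g_i
    return x / (1.0 + x)


# Assumed illustrative parameters (those quoted for Table 1).
alpha, beta = 1.0 / 3.0, 0.97
tau0, tau1, pi0, pi1 = 0.25, 0.50, 0.9, 0.9

g0, Delta = g_closed_form(tau0, tau1, pi0, pi1, alpha, beta)
g1, _ = g_closed_form(tau1, tau0, pi1, pi0, alpha, beta)

# Residuals of recursion (10), state by state; both should be near zero.
x = alpha * beta
res0 = g0 - (pi0 * (1 - tau0) * (1 + x * g0) + (1 - pi0) * (1 - tau1) * (1 + x * g1))
res1 = g1 - (pi1 * (1 - tau1) * (1 + x * g1) + (1 - pi1) * (1 - tau0) * (1 + x * g0))

print("g(1-tau0), g(1-tau1):", g0, g1)
print("gamma(tau0), gamma(tau1):", share(g0, alpha, beta), share(g1, alpha, beta))
print("residuals of (10):", res0, res1)
print("(15) check:", (g0 - g1) - (tau1 - tau0) * (pi0 + pi1 - 1) / Delta)
```

Setting `pi0 = pi1 = 0.5` in this sketch removes persistence, so the two values of $g$ coincide and the investment share no longer depends on the current tax rate.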

A comparison between these two pro- 
cesses, a first-order Markov process with 
$\pi_0 + \pi_1 > 1$ and an i.i.d. process, points out
the problems that arise if one simply as- 
sumes that agents have perfect foresight. 
The behavior of investment for some given 
realization of taxes depends crucially on the 
distribution from which tax realizations were 
drawn. As will be shown in more detail in 
Section IV, arbitrarily forcing expectations 
to equal actual realizations may be mislead- 
ing and certainly affects the quantitative 
results of the analysis. 

Using the simple nature of the closed- 
form solutions, the correlation coefficients 
between tax rates and output, consumption, 
or the capital stock can be calculated. These 
correlations may be of more interest than 
the simple comparative static exercise just 
conducted, since they facilitate a compari- 
son of the model with actual data. With 
100-percent depreciation, the correlation 
coefficients between taxes and output and 
between taxes and capital are equivalent, 
and only the latter is presented. Numerical 
results for $\alpha = 1/3$, $\beta = 0.97$, $\pi_0 = \pi_1 = 0.9$,
$\tau_0 = 1/4$, and $\tau_1 = 1/2$ are presented in
Table 1. 
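
For the symmetric two-state case, the lead and lag correlations between the log tax rate and the log capital stock can be sketched in a few lines. The code below is an illustration under stated assumptions, not the paper's calculation: it assumes $\pi_0 = \pi_1 = \pi$ (so each state has stationary probability 1/2), uses the log-linear recursion $\log k_{t+1} = \log \gamma(\tau_t) + \alpha \log k_t$ implied by $k' = \gamma(\tau) k^{\alpha}$, and assumes the high-tax state has the lower investment share. The function name is hypothetical. With these inputs the sketch gives roughly -0.98 at one lead and -0.78 contemporaneously.

```python
import math


def corr_log_tax_log_capital(j, alpha, delta, terms=400):
    """Correlation between log(tau_t) and log(k_{t+j}) for a symmetric
    two-state Markov chain with persistence delta = 2*pi - 1."""
    # log k_{t+j} = sum_m alpha**m * log gamma(tau_{t+j-1-m}); the covariance of
    # log tau_t with log gamma(tau_{t+n}) is proportional to delta**|n|.
    num = sum(alpha ** m * delta ** abs(j - 1 - m) for m in range(terms))
    # Variance of log k in the same units (closed form of the double sum).
    var_k = (1.0 + alpha * delta) / ((1.0 - alpha * delta) * (1.0 - alpha ** 2))
    # Negative sign: assumes the high-tax state has the lower investment share.
    return -num / math.sqrt(var_k)


alpha, pi = 1.0 / 3.0, 0.9      # assumed illustrative values
delta = 2.0 * pi - 1.0
for j in (3, 2, 1, 0, -1, -2, -3):
    print(j, round(corr_log_tax_log_capital(j, alpha, delta), 2))
```

The scale factors $(\log\tau_0 - \log\tau_1)$ and $(\log\gamma(\tau_0) - \log\gamma(\tau_1))$ cancel in the correlation, which is why only $\alpha$ and $\delta$ enter the sketch.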

Although straightforward, the calcula- 
tions are cumbersome. To avoid burdening 
the reader, only the derivation for the covariance
between $\hat\tau_t = \log(\tau_t)$ and $\hat k_{t+1} = \log(k_{t+1})$
is presented. The calculation of
covariances at other leads and lags is similar.
From the definition of covariance,

(16) $\mathrm{Cov}(\hat\tau_t, \hat k_{t+1}) \equiv E(\hat\tau_t \hat k_{t+1}) - E(\hat\tau_t) E(\hat k_{t+1})$.

TABLE 1: CORRELATION COEFFICIENTS BETWEEN ln(τ_t) AND Δln y, Δln(1+ρ), Δln q

                         (a) Predicted by the Modelᵃ           (b) Actual Dataᵇ
                         x_t =      x_t =        x_t =         x_t =      x_t =        x_t =
                         Δln(y_t)   Δln(1+ρ_t)   Δln(q_t)      Δln(y_t)   Δln(1+ρ_t)   Δln(q_t)

cor(ln τ_t, x_{t+3})      0.31       0.016        0.27          0.14       0.17        -0.26
cor(ln τ_t, x_{t+2})      0.17       0.12         0.10         -0.009     -0.03        -0.18
cor(ln τ_t, x_{t+1})     -0.43       0.47        -0.50         -0.22      -0.25        -0.007
cor(ln τ_t, x_t)         -0.35      -0.096       -0.28         -0.38      -0.56         0.10
cor(ln τ_t, x_{t-1})     -0.27      -0.075       -0.23         -0.001     -0.22         0.02
cor(ln τ_t, x_{t-2})     -0.22      -0.066       -0.15         -0.19      -0.16        -0.01
cor(ln τ_t, x_{t-3})     -0.18      -0.025       -0.14          0.25      -0.13        -0.10

ᵃThe correlation coefficients from the model are calculated by using parameter values of $\beta = 0.97$, $\alpha = 1/3$,
$\Pi = \begin{pmatrix} 0.9 & 0.1 \\ 0.1 & 0.9 \end{pmatrix}$, $\tau_0 = 0.25$, and $\tau_1 = 0.50$.

ᵇThe effective tax rate series is from Hulten and Robertson (1982) for the years 1952-84. The data on ex ante
expected real interest rates subtracts estimates of expected inflation calculated by using an autoregression from the
one-year municipal bond rates. The series extend from 1960 to 1986.


Using the recursive relationship $\hat k_{t+1} = \hat\gamma(\tau_t) + \alpha \hat k_t$,
where $\hat\gamma(\tau_t) = \log \gamma(\tau_t)$, and the stationarity of the
model, (16) can be written as

(17) $\mathrm{Cov}(\hat\tau_t, \hat k_{t+1}) = E\left[\hat\tau_t\left(\hat\gamma(\tau_t) + \alpha\hat\gamma(\tau_{t-1}) + \alpha^2\hat\gamma(\tau_{t-2}) + \cdots\right)\right] - \dfrac{E(\hat\tau)\,E(\hat\gamma(\tau))}{1-\alpha}$.

Using the Markov property of $\tau$ and $\gamma$
yields

(18) $\mathrm{Cov}(\hat\tau_t, \hat k_{t+1}) = \dfrac{1}{4}\,\dfrac{(\hat\tau_0 - \hat\tau_1)\left(\hat\gamma(\tau_0) - \hat\gamma(\tau_1)\right)}{1 - \alpha\delta}$

where $\delta = 2\pi - 1$. Calculating the variance
of $\hat\tau$ is straightforward, while deriving the
variance of $\hat k_{t+1}$ is somewhat more involved,
since it equals $[1/(1-\alpha^2)]\left[\mathrm{Var}(\hat\gamma(\tau)) + 2\alpha\,\mathrm{Cov}(\hat\gamma(\tau_t), \hat k_t)\right]$.
Letting $r(j)$ = correlation$(\hat\tau_t, \hat k_{t+j})$ and using the above parameter
values, one finds that $[r(3), r(2), r(1), r(0), r(-1), r(-2), r(-3)]$
equals $[-0.76, -0.90, -0.98, -0.78, -0.62, -0.49, -0.37]$. The
largest effect occurs at one lead and gradually
decays in both directions. This pattern
explains why the calculated correlation coefficients
involving output in Table 1, which
involve first differences of the data, are positive
at leads 2 and 3.

ii. Security Prices. Using equation (11)
and the fact that $\tau$ follows a two-state
Markov process, the price of an equity claim
can be simplified to

(19) $q(\tau_i) = \beta(1-\alpha)\, c_t \left[(I + \beta\Pi + \beta^2\Pi^2 + \cdots)\, g\right]_i$

where $\Pi$ is the probability transition matrix
$\begin{pmatrix} \pi_0 & 1-\pi_0 \\ 1-\pi_1 & \pi_1 \end{pmatrix}$,
$g$ is the vector with elements $g(1-\tau_0)$ and $g(1-\tau_1)$,
and the subscript $i$ selects the element for the current state $\tau_i$.
The solution reduces to

(20) $q(\tau_i) = \dfrac{(1-\alpha)\beta\,(1-\gamma(\tau_i))}{(1-\beta)(1-\beta\delta)}\; y_t \left[(1-\beta\pi_j)\, g(1-\tau_i) + \beta(1-\pi_i)\, g(1-\tau_j)\right]$

for $i, j = 0, 1$ and $i \ne j$. For the case where
$\pi_0 = \pi_1 = \pi$, equation (20) along with the
definition of $\gamma(\cdot)$ implies that the sign of
$q(\tau_0) - q(\tau_1)$ is the same as the sign of
$[g(1-\tau_0) - g(1-\tau_1)]\,[1 - \beta - \alpha\beta^2(1-\pi)(g(1-\tau_0) + g(1-\tau_1))]$.
When $\pi > 1/2$,
the first bracketed expression is positive,
while the second depends on the values of
the relevant parameters. For $\pi$ approaching
unity, the second term is positive. The sign
of $q(\tau_0) - q(\tau_1)$ is ambiguous
because equity prices are affected
through two channels. With lower taxes and
persistence of the tax process, individuals
expect higher dividends, driving equity
prices up. However, a low tax state implies
lower current consumption, which tends to
lower equity prices.
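
A numerical cross-check of the equity-price formulas is sketched below, under stated assumptions. It computes the price-to-output ratio $q/y$ in each tax state two ways: by truncating the discounted sum in (11) for a two-state Markov chain, and from the closed form obtained by inverting $I - \beta\Pi$. All function names are hypothetical, the parameter values are those quoted for Table 1, and the code illustrates the algebra rather than reproducing any reported figure.

```python
def solve_g_two_state(tau, pi, alpha, beta, iters=2000):
    """Iterate recursion (10) for a two-state chain; pi[i] = prob(stay in state i)."""
    x = alpha * beta
    g = [0.0, 0.0]
    for _ in range(iters):
        g = [
            pi[i] * (1 - tau[i]) * (1 + x * g[i])
            + (1 - pi[i]) * (1 - tau[1 - i]) * (1 + x * g[1 - i])
            for i in (0, 1)
        ]
    return g


def q_over_y_series(i, g, pi, alpha, beta, horizon=5000):
    """Truncated version of (11), using c = (1 - gamma) * y."""
    v = list(g)                      # v[s] = E[g(1 - tau_{t+n}) | tau_t = s]
    total, disc = 0.0, 1.0
    for _ in range(horizon):
        total += disc * v[i]
        v = [pi[0] * v[0] + (1 - pi[0]) * v[1],
             (1 - pi[1]) * v[0] + pi[1] * v[1]]
        disc *= beta
    one_minus_gamma = 1.0 / (1.0 + alpha * beta * g[i])
    return beta * (1 - alpha) * one_minus_gamma * total


def q_over_y_closed(i, g, pi, alpha, beta):
    """Closed form from inverting (I - beta * Pi) for the two-state chain."""
    j = 1 - i
    delta = pi[0] + pi[1] - 1.0
    one_minus_gamma = 1.0 / (1.0 + alpha * beta * g[i])
    bracket = (1 - beta * pi[j]) * g[i] + beta * (1 - pi[i]) * g[j]
    return (1 - alpha) * beta * one_minus_gamma * bracket / ((1 - beta) * (1 - beta * delta))


alpha, beta = 1.0 / 3.0, 0.97        # assumed illustrative values
tau, pi = [0.25, 0.50], [0.9, 0.9]
g = solve_g_two_state(tau, pi, alpha, beta)
for i in (0, 1):
    print(i, q_over_y_series(i, g, pi, alpha, beta), q_over_y_closed(i, g, pi, alpha, beta))
```

Agreement between the two columns confirms only that the closed form and the discounted sum are consistent with each other for these inputs.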

Of more interest are the correlation co- 
efficients between equity prices and tax 
rates. Unlike those for output, their sign is 
not so clear-cut. However, it generally does 
not take much persistence to yield a negative
correlation between $q_{t+j}$ and $\tau_t$ for
$j = 0, 1, 2, 3$. This result occurs because, when
tax rates persist, consumption will rise over
time when taxes are low and so too will
equity prices. Therefore, low tax rates are
more likely to be associated with high security
prices. The vector of correlation coefficients
for the parameter values used in the
previous section is [-0.74, -0.84, -0.40,
-0.32, -0.24, -0.16], with the peak occurring
at lead two. For less persistence, $\pi = 0.8$,
a similar pattern was obtained, while
for $\pi = 0.55$, all but the contemporaneous
and first lagged correlation coefficient were 
negative. Further, since effective tax rates 
are inversely related to inflation, the model 
is capable of generating the negative corre- 
lation between stock prices and inflation 
that is commonly observed in the United 
States. 

The results of this section can be viewed 
as a stochastic analogue of the results in 
Brock and Turnovsky (1981) in the case 
where firms use equity financing. In their 
article, share value would rise during an 
adjustment from a high to low tax rate, 
while equity prices would initially fall and 
then rise as consumption increased to its 
new steady-state value. However, in their 
model the steady-state value of equity prices
only depends on corporate taxes through 
their effect on the marginal utility of con- 
sumption. This feature is due to the simple 
nature of the process for dividends in which 
dividends do not vary with corporate tax 
rates.

iii. The After-Tax Real Rate of Interest.
Calculating the equilibrium after-tax risk-free
real rate of interest, $\rho$, is accomplished
by considering the price, $p^B_t$, of a tax-free
bond, $B$, that yields one unit of consumption
next period. This is done by adding
$p^B_t B_{t+1}$ to the left-hand side of (3) and $B_t$
to the definition of wealth. The resulting
first-order condition with respect to $B_{t+1}$ is

(21) $\beta E_t \lambda_{t+1} = \lambda_t p^B_t$.

Equation (21) implies that an individual is
indifferent at the margin between sacrificing
$p^B_t$ units of wealth today for one unit of
wealth next period.

Using the expressions for consumption
and investment and the fact that
$1/[1 - \gamma(\tau_t)] = 1 + \alpha\beta g(1-\tau_t)$ implies that

(22) $1 + \rho_t = \dfrac{1}{p^B_t} = \dfrac{1/c_t}{\beta E_t(1/c_{t+1})}
= \dfrac{1 + \alpha\beta g(1-\tau_t)}{\beta E_t\left(1 + \alpha\beta g(1-\tau_{t+1})\right)} \times \gamma(\tau_t)^{\alpha}\, y_t^{\alpha-1}$.

For the case where $\tau$ follows the process
given by (13a) and (13b) and where taxes
show persistence ($\pi_0, \pi_1 > 1/2$), it can be
shown that the after-tax real rate of interest
is higher when a low tax state is realized.⁹
This rise in real rates occurs because a
lowering of the tax rate indicates that capital
will be more valuable in the future and
that there will be more future output. Individuals
will, therefore, value a unit of future
wealth by less, causing $p^B_t$ to fall and $1 + \rho_t$
to rise. Alternatively, individuals will wish to
accumulate more capital. In order to induce
lower consumption, the real rate of interest
must rise.

⁹The proof of this relies on the facts that if $\pi_0, \pi_1 > 1/2$
then $g(1-\tau_0) > g(1-\tau_1)$ and that

$\dfrac{1 + \alpha\beta g(1-\tau_0)}{\beta\left(1 + \alpha\beta E[g(1-\tau_{t+1}) \mid \tau_t = \tau_0]\right)}
> \dfrac{1 + \alpha\beta g(1-\tau_1)}{\beta\left(1 + \alpha\beta E[g(1-\tau_{t+1}) \mid \tau_t = \tau_1]\right)}$.

The proof of the latter inequality involves some cumbersome
algebra and is omitted.
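
The comparison of real rates across tax states can be illustrated numerically. The sketch below evaluates the state-dependent part of (22), holding current output fixed so that the common factor $y_t^{\alpha-1}$ drops out of the comparison. The function names are hypothetical and the parameter values are those quoted for Table 1; the code is an assumption-laden illustration, not the omitted proof.

```python
def solve_g_two_state(tau, pi, alpha, beta, iters=2000):
    """Iterate recursion (10) for a two-state chain; pi[i] = prob(stay in state i)."""
    x = alpha * beta
    g = [0.0, 0.0]
    for _ in range(iters):
        g = [
            pi[i] * (1 - tau[i]) * (1 + x * g[i])
            + (1 - pi[i]) * (1 - tau[1 - i]) * (1 + x * g[1 - i])
            for i in (0, 1)
        ]
    return g


def gross_rate_factor(i, g, pi, alpha, beta):
    """State-dependent part of (22): (1 + rho_t) divided by y_t**(alpha - 1)."""
    x = alpha * beta
    expected_g_next = pi[i] * g[i] + (1 - pi[i]) * g[1 - i]
    gamma_i = x * g[i] / (1 + x * g[i])
    return (1 + x * g[i]) / (beta * (1 + x * expected_g_next)) * gamma_i ** alpha


alpha, beta = 1.0 / 3.0, 0.97        # assumed illustrative values
tau, pi = [0.25, 0.50], [0.9, 0.9]   # state 0 = low tax, state 1 = high tax
g = solve_g_two_state(tau, pi, alpha, beta)
low_tax = gross_rate_factor(0, g, pi, alpha, beta)
high_tax = gross_rate_factor(1, g, pi, alpha, beta)
print("low-tax factor :", low_tax)
print("high-tax factor:", high_tax)
```

For these inputs the low-tax factor exceeds the high-tax factor, which is the direction described in the text; other parameter values would need to be checked separately.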

Regarding correlation coefficients, the 
contemporaneous and lagged correlations 
are all negative, while the lead correlations 
are generally positive (the exception being 
$\mathrm{Cov}(\hat\tau_t, \ln(1 + \rho_{t+1}))$ when $\pi > 0.9$).

iv. A Comparison with the Data. Because 
the time-series of output, real interest rates, 
and stock prices appear to be nonstationary, 
correlations between the log of the effective 
tax rate on capital and the difference of the 
logs of the various series were used. In 
Table 1, the actual correlation coefficients 
are compared with those generated by the 
model. The results are not overly encourag- 
ing, but it would be presumptuous to be- 
lieve that the extraordinarily simple model 
in this section would closely mimic the data. 
In addition to the model’s simplicity, the 
actual data, particularly the effective tax-rate 
series and the ex ante real rate, suffer from 
measurement error. A more thorough anal- 
ysis, which is beyond the scope of this pa- 
per, would incorporate taxes into a more 
detailed model of the economy to see if 
including taxes would significantly improve 
the model’s ability to replicate the data. 

For the model in this section the best 
results occur for output, where the contemporaneous
and first lead correlation coefficients
are not too different from those
predicted by the model. The model’s results 
for ex ante after-tax real rates exhibit some 
of the same sign patterns as the data, while 
the actual data for equity prices do not 
match the model at all. 


III. Investment Tax Credits


As claimed earlier in the paper, the sim- 
plified treatment of the tax structure does 
not change the basic message, that uncer- 
tainty should be explicitly modeled. Also, as 
shown in Section IV, explicitly modeling 
uncertainty, as opposed to assuming perfect 
foresight, tends to reduce the range of val- 
ues attained by the capital stock for both 
the case of a tax on production and an
investment tax credit. To keep things simple,
the effects of a stochastic investment
tax credit are examined separately rather
than studying both production taxes and the
investment tax credit simultaneously.

The change in the model is straightforward.
Lump-sum government transfers are
now negative and equal $-\theta_t K_{t+1}$, where $\theta_t$
is the investment tax credit at time $t$. Its
value is known at time $t$, but its future
values are uncertain. Each individual receives
$\theta_t k_{t+1}$, and this term is added to the
right-hand side of the individual's budget
constraint. That is, instead of (3), we now
have

(3'') $c_t + k_{t+1} + q_t s_{t+1} \le w_t + \theta_t k_{t+1}$

and equation (6b) would now be

(6b') $\beta E D_1 V(k',K',s',\theta') = \lambda(1-\theta)$.


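With exponential production and log utility,
the proportion of output invested is
given by

(9') $\gamma(\theta) = \dfrac{\alpha\beta g(\theta)}{1 + \alpha\beta g(\theta)}$

where $g(\theta)$ is determined by the recursive
relationship

(10') $g(\theta) = \dfrac{1}{1-\theta}\left[1 + E\,\alpha\beta g(\theta')\right]$.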


From (10') one observes that the current
investment tax credit, unlike the current
profits tax, directly affects investment, since
it influences the relative value of $k_{t+1}$.
Therefore, even if $\theta$ were distributed independently
and identically, investment would
be influenced by realizations of $\theta$.

TABLE 2

Time     Tax     γ(τ_t) Under        k_{t+1} Under       γ(τ_t) Under    k_{t+1} Under
Period   Rate    Perfect Foresight   Perfect Foresight   Uncertainty     Uncertainty

 0                                   0.095                               0.095
 1       0.50    0.167               0.076               0.18            0.082
 2       0.50    0.167               0.071               0.18            0.078
 3       0.50    0.167               0.069               0.18            0.077
 4       0.50    0.167               0.068               0.18            0.077
 5       0.50    0.167               0.068               0.18            0.076
 6       0.50    0.167               0.068               0.18            0.076
 7       0.50    0.169               0.069               0.18            0.076
 8       0.50    0.180               0.074               0.18            0.076
 9       0.50    0.250               0.105               0.18            0.076
10       0.25    0.250               0.118               0.24            0.102
11       0.25    0.250               0.123               0.24            0.112
12       0.25    0.250               0.124               0.24            0.116
13       0.25    0.250               0.125               0.24            0.117
14       0.25    0.250               0.125               0.24            0.117
15       0.25    0.245               0.123               0.24            0.118
16       0.25    0.231               0.115               0.24            0.118
17       0.50    0.167               0.081               0.18            0.089
18       0.50    0.167               0.072               0.18            0.080
19       0.50    0.167               0.069               0.18            0.078
20       0.50    0.167               0.068               0.18            0.077

It is straightforward but tedious to show
that, for reasonable parameter values when
$\theta$ follows a two-state Markov process and
when tax credits show persistence, investment
is higher when $\theta$ is high. Further,
since investment decisions also involve expectations
of future productivity, which depend
on future levels of capital, the persistence
of the process generating $\theta$ will be
important. Also, the more persistent $\theta$ is,
the greater the difference in $\gamma(\theta)$ when tax
credits are high as opposed to low. Again,
this result is similar to one reported in Bizer
and Judd (1988). Therefore, taking account
of the exact nature of the generating mechanism
for $\theta$ is necessary for analyzing economic
behavior.
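
The investment-tax-credit recursion can be solved the same way as the production-tax case. The sketch below iterates (10') on a two-state chain and applies (9'). The function names are hypothetical; the parameter values ($\alpha = 1/3$, $\beta = 1$, credits of 0.10 and 0, persistence 0.9) are the ones the paper reports for its numerical example, used here only for illustration.

```python
def solve_g_credit(theta, P, alpha, beta, tol=1e-12, max_iter=10_000):
    """Iterate (10'): g[i] = (1 + alpha*beta * sum_j P[i][j]*g[j]) / (1 - theta[i])."""
    n = len(theta)
    g = [0.0] * n
    for _ in range(max_iter):
        g_new = [
            (1.0 + alpha * beta * sum(P[i][j] * g[j] for j in range(n)))
            / (1.0 - theta[i])
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(g_new, g)) < tol:
            return g_new
        g = g_new
    raise RuntimeError("fixed-point iteration did not converge")


def investment_share(g_i, alpha, beta):
    """Equation (9'): share of output invested."""
    x = alpha * beta * g_i
    return x / (1.0 + x)


alpha, beta = 1.0 / 3.0, 1.0
theta = [0.10, 0.0]                  # state 0 = credit in place, state 1 = no credit
P = [[0.9, 0.1], [0.1, 0.9]]

g = solve_g_credit(theta, P, alpha, beta)
gamma = [investment_share(g_i, alpha, beta) for g_i in g]
print("g     =", g)
print("gamma =", gamma)

# Constant-credit benchmark: with theta fixed, (10') gives g = 1/(1 - theta - alpha*beta).
g_const = solve_g_credit([0.10], [[1.0]], alpha, beta)[0]
print("g with theta fixed at 0.10 =", g_const)
```

Unlike the production-tax case, the current credit enters the recursion directly through the factor $1/(1-\theta)$, so the investment share responds to the current realization even when the chain has no persistence.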


IV. A Numerical Example 


In this section, a direct comparison is 
made between capital stock accumulation 
under perfect foresight and situations where 
agents behave under uncertainty. To per- 
form the experiment, 20 realizations of tax 
rates and investment tax credit rates were 
generated using a random number genera- 
tor and the actual transition probabilities 
associated with the effective tax rate series 
given in Hulten and Robertson (1982).¹⁰ The


¹⁰For the production tax, the exact procedure was
to look at the Hulten and Robertson effective tax rate 
series as a realization of a two-state first-order Markov 
process and use the calculated sample transition prob- 
abilities and sample means. Then, 20 random numbers 
between zero and one were generated. It was assumed 
that the initial tax rate was high and that a number 


parameter values used were α = 1/3, β = 1,
τ_0 = 1/4, τ_1 = 1/2, and π_0 = π_1 = 0.9. For
investment tax credits, θ_0 = 0.10 and θ_1 = 0.
The realizations and resulting levels of the
capital stock are given in Tables 2 and 3.
The starting value of the capital stock for
the first experiment is the value that would
result if τ = 0.375 (the average value of the
tax rate) for all time and, for the second
experiment, the value that would result if
θ = 0.05 for all time. In calculating the val-
ues under the assumption of perfect fore-
sight, it was assumed that g(1 − τ_20) = 0.60
and g(θ_20) = 1.765. These assumptions are
important regarding the solution for the last 
period’s capital stock, but the terminal val- 
ues have almost no effect on the numbers 
reported in Tables 2 and 3. 

From Table 2, it is clear that the capital 
stock under uncertainty behaves in a quanti- 
tatively different manner than it does under 
perfect foresight. The movements in capital 


between 0 and 0.1 implied a change in the tax rate, 
while a number between 0.1 and 1.0 implied that the 
tax rate remained at its previous value. For the invest- 
ment tax credit, no actual series was available. It was 
assumed that π_0 = π_1 = 0.9 for this process as well.
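
The draw-by-draw rule described in this footnote can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' program: the function name `simulate_tax_path` and the example draws are hypothetical, while the 0.1 switching threshold, the high initial state, and the two tax rates come from the text.

```python
def simulate_tax_path(draws, tau_high=0.5, tau_low=0.25,
                      switch_below=0.1, start_high=True):
    """Map uniform(0, 1) draws into a two-state tax-rate path.

    A draw below `switch_below` changes the tax rate; any other draw
    leaves it at its previous value, as the footnote describes.
    """
    high = start_high
    path = []
    for u in draws:
        if u < switch_below:
            high = not high          # the draw signals a change of state
        path.append(tau_high if high else tau_low)
    return path


# Hypothetical draws for illustration only (not the draws used in the paper).
example_path = simulate_tax_path([0.42, 0.77, 0.05, 0.63, 0.91, 0.02])
print(example_path)
```

With a staying probability of 0.9, long runs at one tax rate are the typical outcome, which is the pattern of the realizations listed in Tables 2 and 3.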
TABLE 3

Time     Investment  γ(θ_t) Under        k_{t+1} Under       γ(θ_t) Under   k_{t+1} Under
Period   Tax Credit  Perfect Foresight   Perfect Foresight   Uncertainty    Uncertainty
  0                                      0.208                              0.208
  1      0.10        0.370               0.219               0.368          0.218
  2      0.10        0.370               0.223               0.368          0.221
  3      0.10        0.370               0.225               0.368          0.223
  4      0.10        0.370               0.225               0.368          0.223
  5      0.10        0.370               0.225               0.368          0.223
  6      0.10        0.370               0.225               0.368          0.223
  7      0.10        0.369               0.224               0.368          0.223
  8      0.10        0.366               0.222               0.368          0.223
  9      0.10        0.357               0.216               0.368          0.223
 10      0           0.333               0.200               0.335          0.203
 11      0           0.333               0.195               0.335          0.197
 12      0           0.333               0.193               0.335          0.195
 13      0           0.333               0.193               0.335          0.194
 14      0           0.335               0.193               0.335          0.194
 15      0           0.338               0.195               0.335          0.194
 16      0           0.346               0.201               0.335          0.194
 17      0.10        0.370               0.217               0.368          0.213
 18      0.10        0.370               0.223               0.368          0.220
 19      0.10        0.370               0.224               0.368          0.222
 20      0.10        0.370               0.225               0.368          0.223

tend to be smoothed out by uncertainty,
since there is always some positive probabil-
ity that next period’s tax rate will be differ-
ent from today’s. Also, under perfect fore-
sight, agents respond one period sooner to
tax rate changes than do agents who are
unsure of the value of next period’s tax rate.

This difference in behavior would also
occur if taxes were independently and iden-
tically distributed. Under uncertainty, the
level of the capital stock would remain at
0.095, independent of the actual realiza-
tions of taxes, while with perfect foresight
the capital stock would respond consider-
ably. Therefore, if one is to predict accu-
rately how agents will respond to tax rate
changes, one must carefully consider the
forecasting problem facing agents. Doing so
requires an explicit stochastic treatment of
the problem.

Regarding Table 3, one observes that the
relative range of capital values is slightly
smaller under uncertainty and that agents
do not alter their behavior prior to new
realizations of the investment tax credit.
However, unlike the case with a tax on
production, one-period movements in the
capital stock can be somewhat sharper un-
der uncertainty.


V. Summary


This article analyzes the effects of taxes
on production in a simple stochastic growth
model. The paper represents an advance
since it treats tax rates as inherently
stochastic. As shown, the actual process
generating taxes is an important determi-
nant in understanding how the economy will
behave with respect to particular realiza-
tions of tax rates.

One might also wish to explore the inter- 
action between tax changes and nominal 
magnitudes such as inflation and the nomi- 
nal interest rate. Extending the model to 
incorporate money via a cash-in-advance 
constraint is not a problem. The basic solu- 
tion governing the real side of the economy 
is unchanged if the cash-in-advance con- 
straint is only on consumption and if one 
rules out precautionary demands for cash. 
This is easily done so long as monetary 
growth is not overly deflationary. The solu- 
tion is only slightly changed if the cash-in- 
advance constraint includes capital as well. 
In these cases, a decrease in money growth 
causes capital accumulation and an increase 
in output. For given money growth, the price 
level falls, and the economy experiences 
lower inflation. Since little is to be gained
through these additions, the paper concen- 
trates on real economic variables. 


APPENDIX 
This Appendix provides a more formal description
of equilibrium and shows that the solution in the text is
an equilibrium: {p_t*, q_t*, c_t*, x_t*, s_t*, k_t*, π_t*, T_t*}_{t=0}^{∞} is
an equilibrium given the tax process {τ_t*}_{t=0}^{∞} if

(A1) {c_t*, s_t*, x_t*} solves the consumer’s problem

        max E Σ_{t=0}^{∞} β^t u(c_t)

subject to

        c_t + x_{t+1} + q_t* s_{t+1} = (q_t* + π_t*) s_t + p_t* x_t + T_t*
        x_0 = k
        s_0 = 1

where u(c_t) = ln c_t;

(A2) {k_t*} solves the firm’s problem

        max (1 − τ_t*) f(k_t) − p_t* k_t

where f(k_t) = k_t^α and k_t > 0;

(A3) π_t* = (1 − τ_t*) f(k_t*) − p_t* k_t*;

(A4) k_t* = x_t*;

(A5) s_t* = 1;

and

(A6) T_t* = τ_t* f(k_t*),

where τ follows a first-order Markov process, is
satisfied. Substituting the postulated solutions
c = [1 − γ(τ)]k^α and k′ = γ(τ)k^α into (7b) yields

        1/[1 − γ(τ)] = αβ E{(1 − τ′)/[1 − γ(τ′)]} · 1/γ(τ).

The solution to this nonlinear first-order difference
equation is satisfied for

        γ(τ) = αβ g(1 − τ) / [1 + αβ g(1 − τ)]

where

        g(1 − τ_t) = E_t{(1 − τ_{t+1})[1 + αβ E_{t+1}{(1 − τ_{t+2})[1 + αβ E_{t+2}{...}]}]}

or

        g(1 − τ) = E{(1 − τ′)[1 + αβ g(1 − τ′)]}.

This solution satisfies all six equilibrium conditions.
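
The recursion for g can be solved numerically by iterating on it until it settles. The sketch below is an illustration rather than the author's procedure: the function names are hypothetical, the transition matrix assumes a staying probability of 0.9 in both states, and the parameter values are those quoted in Section IV.

```python
def solve_gamma(alpha, beta, taus, P, tol=1e-12, max_iter=10_000):
    """Iterate g(1 - tau) = E{(1 - tau')[1 + alpha*beta*g(1 - tau')]}.

    `taus[j]` is the tax rate in state j and `P[i][j]` the probability of
    moving from state i to state j. Returns gamma(tau) for each state.
    """
    ab = alpha * beta
    n = len(taus)
    g = [0.0] * n
    for _ in range(max_iter):
        new_g = [
            sum(P[i][j] * (1 - taus[j]) * (1 + ab * g[j]) for j in range(n))
            for i in range(n)
        ]
        converged = max(abs(a - b) for a, b in zip(new_g, g)) < tol
        g = new_g
        if converged:
            break
    return [ab * gi / (1 + ab * gi) for gi in g]


def constant_tax_capital(alpha, beta, tau):
    """Steady-state capital when the tax rate never changes: k = gamma^(1/(1-alpha))."""
    gamma = solve_gamma(alpha, beta, [tau], [[1.0]])[0]
    return gamma ** (1 / (1 - alpha))


# Two-state example: state 0 is the low tax (1/4), state 1 the high tax (1/2).
gamma_low, gamma_high = solve_gamma(
    1 / 3, 1.0, [0.25, 0.5], [[0.9, 0.1], [0.1, 0.9]]
)
print(round(gamma_low, 3), round(gamma_high, 3))       # about 0.240 and 0.177
print(round(constant_tax_capital(1 / 3, 1.0, 0.375), 3))  # about 0.095
```

Under these assumptions the two values of γ round to the 0.24 and 0.18 reported under uncertainty in Table 2, and the constant-tax capital stock matches the 0.095 starting value used in the first experiment.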
Deposit Insurance, Risk, and Market Power in Banking


By MICHAEL C. KEELEY*


A fixed-rate deposit insurance system provides a moral hazard for excessive risk
taking and is not viable absent regulation. Although the deposit insurance
system appears to have worked remarkably well over most of its 50-year history,
major problems began to appear in the early 1980’s. This paper tests the
hypothesis that increases in competition caused bank charter values to decline,
which in turn caused banks to increase default risk through increases in asset
risk and reductions in capital. (JEL 600)


It has long been recognized that a fixed- 
rate deposit insurance system, such as the 
Federal Deposit Insurance Corporation’s 
(FDIC’s), or the Federal Savings and Loan 
Insurance Corporation’s (FSLIC’s) can pose 
a moral hazard for excessive risk taking. 
The reason is that banks or thrifts can bor- 
row at or below the risk-free rate by issuing 
insured deposits and then investing the pro- 
ceeds in risky assets with higher expected 
yields. 

As Robert C. Merton (1977) has shown, 
deposit insurance can be viewed as a put 
option on the value of a bank’s assets at a 
strike price equal to the promised maturity 
value of its debt. Under a fixed-rate system, 
banks potentially can transfer wealth from 
the insuring agency, and, absent regulation, 
banks seeking to maximize the value of their 
equity will maximize the value of the put by 


*Vice President, Cornerstone Research, 1000 El 
Camino Real, Menlo Park, CA 94025. Much of the 
research in this paper was conducted while the author 
was a research officer at the Federal Reserve Bank of 
San Francisco. However, opinions expressed herein are 
those of the author and do not necessarily reflect the 
view of the Federal Reserve Bank of San Francisco, 
the Board of Governors of the Federal Reserve Sys- 
tem, or Cornerstone Research. An earlier version of 
this paper was presented at Garn Institute of Finance’s 
academic symposium on deposit insurance. Comments 
from William Beaver, Jack Beebe, Barbara Bennett, 
Mark Flannery, Christopher James, Ed Kane, Stuart 
Myers, Randall Pozdena, Anthony Saunders, and two 
anonymous referees are greatly appreciated. Alice 
Jacobson provided expert research assistance. The 
usual caveats apply, however. 




increasing asset risk and/or minimizing in- 
vested capital relative to assets. 

Empirical research, however, does not 
seem to show that banks in general maxi- 
mize the put option value. For one thing, 
many banks hold substantially more capital 
than the required amounts (Michael C. 
Keeley, 1988) and for another, researchers 
have found that for many banks, the value 
of the deposit insurance option is less than 
its price (Allan J. Marcus and Israel Shaked, 
1984; Ehud Ronn and Avinash K. Verma, 
1986; George Pennacchi, 1987), assuming 
that at the expiration of the option insolvent 
banks are closed. Moreover, for most of its 
50-year history, the insurance system has 
been characterized by low failure rates and 
low payouts—just the opposite of what might 
be expected if banks were maximizing the 
value of the put option successfully. 

Recently, bank and thrift failures and de- 
posit insurance payouts have reached record 
highs (see Chart 1); the FSLIC has liabili- 
ties far in excess of its assets, and even the 
FDIC faces threats to its solvency. Although 
many have argued that these recent prob- 
lems are in part due to the moral hazard of 
deposit insurance, the question is why it has 
taken 50 years for major problems to arise. 

One explanation (Arnold Kling, 1986) is 
that the recent episode simply reflects an
increasingly risky economy, which in turn 
has increased the risk of bank portfolios. In 
the last few years, whole sectors and regions 
of the national and even the world economy 
have encountered serious downturns that 
CHART 1. DEPOSIT INSURANCE EXPENSES PER DOLLAR OF DEPOSITS AND
BANK FAILURES


have affected the values of bank and thrift
assets. Similarly, interest rates have become
more volatile, increasing the riskiness of
banks’, and especially thrifts’, portfolios.

The rise in bank and thrift failures in
recent years also may reflect the secular
decline in capital-to-asset ratios over the
past two decades. As Chart 2 shows, both
market and book capital ratios of the 25
largest bank holding companies have fallen
well below their levels in the mid-1950’s,
when only a handful of banks and thrifts
failed each year, as opposed to several hun-
dred per year recently. Moreover, beginning
in about 1974, market values of the 25
largest bank holding companies in the ag-
gregate fell below book capital ratios.

There are two reasons why declining cap-
ital ratios could lead to an increased rate of
bank failures. First, lower capital, holding
asset risk constant, leads to less protection
against failure. Second, as shown in Freder-
ick T. Furlong and Keeley (1987, 1989),
lower capital ratios increase the incentive
for banks to increase asset risk. Thus, even
if overall risk in the economy did not in-
crease, banks would have a greater incen-
tive to increase asset portfolio risk due to
the decline in capital ratios.


There is little doubt that increased risk in 
the economy and declining capital ratios 
have had a lot to do with the increase in 
bank and especially thrift failures in recent 
years. But these developments do not ex- 
plain why banks and thrifts allowed 
bankruptcy risk to increase. After all, de- 
pository institutions have considerable con- 
trol over the riskiness of their asset portfo- 
lios and perhaps even more control over 
their capital ratios. Thus, these explanations 
beg the question of why capital ratios be- 
haved as they did. 

Specifically, why did banks on average 
hold so much capital during the 1950’s and 
early 1960’s, and why did capital ratios fall 
during the 1960’s and 1970’s? It seems dif- 
ficult to pinpoint any explicit regulatory 
changes that would have made it easier for 
banks to increase default risk, and banks 
had access to fixed-rate deposit insurance
throughout the period.¹ A similar puzzle


¹Although the percentage of deposits explicitly cov-
ered by deposit insurance has increased, in theory,
even partial deposit insurance coverage provides an
incentive for banks to minimize capital and maximize
asset risk (as long as they can share losses with the 
deposit insurance fund). 
CHART 2. CAPITAL-TO-ASSET RATIOS, MARKET AND BOOK VALUES


arises in trying to explain the cross-sectional 
variation in bank capital ratios. Some banks, 
for example, hold much more capital than 
others and much more than regulators re- 
quire. 

This paper argues that one explanation 
for these apparent puzzles involves differ- 
ences and changes in the degree of competi- 
tion faced by banks. In the 1950’s and even 
early 1960’s banks partially were protected 
from competition by a variety of regulatory 
barriers. For example, chartering was very 
restrictive (Sam Peltzman, 1965) until the 
mid-1960’s when James Saxon, then
comptroller of the currency, greatly liberal- 
ized it (Keeley, 1985a, b). Moreover, some 
banks were protected by various state laws 
that limited or prohibited branching, multi- 
bank holding company, and interstate bank 
expansion. However, these laws have been 
greatly liberalized over the last few years, 
possibly eroding banks’ charter values. Like- 
wise, deposit rate deregulation may have 
diminished charter values by increasing 
competition, especially for institutions in 
protected local markets that had been relying
on nonprice service competition to attract
funds. In addition, beginning in the


early 1980’s, thrifts were given expanded 
powers that enabled them to compete more 
fully with banks. Finally, many argue that 
changes in technology have increased the 
competition that banks face from nonbank 
financial firms, such as investment banks, 
brokerage firms, and insurance companies. 
Such developments as money market mu- 
tual funds, cash management accounts, and 
increased use of commercial paper have all 
made competitive inroads in banks’ traditional
product markets.

Increased competition may have reduced 
banks’ incentives to act prudently with re- 
gard to risk taking. In fact, the evidence in 
Chart 2 of declining market values (which 
would reflect capitalized charter values) rel- 
ative to book values (which would not) 
suggests that bank charter values were de- 
clining. In the 1950’s and early 1960's, regu- 
latory restrictions on entry and competition 
made bank charters valuable. With valuable 
charters as assets, banks had an incentive 
not to risk failure since the owners of the 
banks cannot sell the charter once the bank 
is declared insolvent. Instead, a bank that 
was insolvent on a book-value basis still had 
a valuable charter that the FDIC could sell 
in a purchase and assumption (P&A). (In
calculating primary book capital, intangible 
assets, generally representing the excess of 
the purchase price of assets that had been 
acquired over their book value, are sub- 
tracted from capital.) This may explain why 
P&A’s typically are less costly than liquida- 
tions. Also, banks would apparently be will- 
ing to overpay for deposit insurance if it 
were needed to obtain and maintain a valuable
charter.

The possession of a valuable charter thus 
made it difficult for banks to shift losses to 
the FDIC, and its potential loss in essence 
created a regulatory bankruptcy cost from 
the point of view of bank owners. This is 
especially so since regulators focus on book 
value when assessing a bank’s solvency, not 
market value. Thus, the gains from feasible 
increases in risk taking would be offset by 
the diminished expected value of the char- 
ter. As a result, a bank will not have an 
incentive on the margin to increase default 
risk (either through reducing capital relative 
to assets or increasing asset risk) as long as 
the expected loss of the charter exceeds the
gain to the bank of the enhanced value of 
the deposit insurance put option. Moreover, 
regulation limited the feasible increases in 
risk taking so as to prevent banks from 
potentially imposing losses that would have 
exceeded their charter values. This idea is 
established formally below using a state 
preference model. 

In the empirical analysis below a simulta- 
neous equations model of bank risk taking 
and charter value is developed and esti- 
mated. Changes in the laws governing 
branching, multibank holding company ex- 
pansion, and interstate entry are used to 
identify the model statistically. Over the last 
20 years or so, these anticompetitive laws 
have been liberalized greatly, and there are 
virtually no cases of states increasing their 
stringency (Dean F. Amel and Daniel G. 
Keane, 1987). Although the liberalization of 
these laws is not necessarily the most im- 
portant factor in increasing the degree of 
bank competition, it is an easily observed 
exogenous factor with respect to bank risk 
taking. Thus, changes in these laws over 
time provide an opportunity to examine their 
influence on market power in banks and
whether exogenous variations in market 
power are related to bank risk taking. 

This paper first examines the relationship 
between changes in regulatory entry barri- 
ers and the market power of banks in order 
to create an instrument for charter value. 
Then the relationship between market 
power and risk taking is estimated. This 
paper employs James Tobin’s q, as sug- 
gested by Eric Lindenberg and Stephen Ross 
(1981), as a measure of a bank’s market 
power (monopoly rents). Two measures of 
bank risk are then related to exogenous 
variations in q: the market-value capital-to- 
asset ratio and the interest cost on large, 
uninsured CD’s. I find that q appears to be 
a useful proxy for market power and that 
banks with greater market power hold more 
capital and pay lower rates on CD’s. 

The remainder of the paper is organized 
as follows. In Section I, a state preference 
model is employed to show how market 
power can affect bank risk taking and how it 
can be measured. Section II presents the 
empirical results. Finally, Section III con- 
tains a summary and conclusions. 


I. Theoretical Framework 


Below, a state preference model is used 
to develop the major results. The model 
described below closely follows that pre- 
sented in Furlong and Keeley (1989). 
Marcus (1984) has developed similar results 
using an options model, but the state pref- 
erence model clarifies the conditions under 
which a bank can benefit from increasing 
default risk and also illustrates the relation- 
ship between charter value and Tobin’s q. 

A two-period (current and future period),
two-state model is used where P_1 and P_2
are the current values of a dollar payment
in future states 1 and 2, respectively. (Thus,
the risk-free interest rate is 1/(P_1 + P_2) − 1.)


²State preference models have been widely used in
the analysis of banking and deposit insurance—see 
John H. Kareken and Neil Wallace (1978), William F. 
Sharpe (1978), Uri Dothan and Joseph Williams (1980), 
and Furlong and Keeley (1987, 1989). 
The state prices are assumed to be exoge-
nously given. To fund its assets, a bank uses
an initial capital of C_0 dollars and issues
deposits with a current value of D_0 dollars.
Initially, the bank is assumed to issue risk-
free deposits that pay off $1 in each state,
although this assumption is relaxed later to
allow for the issuance of risky deposits. Also,
it is assumed that initially the bank is not
insured, although this assumption is relaxed
later too. The bank invests in an asset secu-
rity A that pays A_1 dollars in state 1 and A_2
dollars in state 2.

The current value of the bank’s equity,
V_0, is found by valuing the various cash
flows at the state prices. It is assumed that
the bank can acquire security A at a price
P_A and issue deposits at a price P_D. Thus,
the bank can purchase (C_0 + D_0)/P_A units
of A. The current value of the bank’s eq-
uity, V_0, is the value of the cash flows from
the assets acquired minus those of the liabil-
ities issued:


(1)  V_0 = [(C_0 + D_0)/P_A][P_1 A_1 + P_2 A_2]
           − (D_0/P_D)(P_1 + P_2).


If the bank is competitive in both the
asset market and the deposit market, then
P_A = P_1 A_1 + P_2 A_2 and P_D = P_1 + P_2, and
equation (1) above reduces to


(2)  V_0 = C_0.


If the bank is not insured, the value of its
equity is independent of its (ex ante) risk
taking. The reason is that if the bank were
to default in state 1 and not meet its
promised obligations to depositors, the de-
positors would demand sufficiently higher
payments in state 2 so as to leave the costs
unchanged at P_D.
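
A small numerical check makes the step from equation (1) to equation (2) concrete. The sketch below is an illustration only: the function name `equity_value` and all numbers (state prices, payoffs, capital, deposits) are hypothetical values chosen for the example, not figures from the paper.

```python
def equity_value(C0, D0, P1, P2, A1, A2, PA, PD):
    """Equation (1): asset payoffs minus deposit payoffs, valued at state prices."""
    asset_value = (C0 + D0) / PA * (P1 * A1 + P2 * A2)
    deposit_value = (D0 / PD) * (P1 + P2)
    return asset_value - deposit_value


# Hypothetical inputs for illustration.
P1, P2 = 0.4, 0.5          # state prices
A1, A2 = 0.8, 1.4          # asset payoffs per unit in states 1 and 2
C0, D0 = 10.0, 90.0        # initial capital and deposits

PA = P1 * A1 + P2 * A2     # competitive asset price
PD = P1 + P2               # competitive deposit price

V0 = equity_value(C0, D0, P1, P2, A1, A2, PA, PD)
print(round(V0, 6))        # equals C0 under competitive pricing
```

Changing A1 and A2 while recomputing PA leaves V0 at C0, which is the sense in which equity value is independent of risk taking for a competitive, uninsured bank.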

However, with deposit insurance, deposi-
tors would not demand a higher payment in
state 2, because the insuring agency would
pay them the difference between the
promised obligations and the asset value in
the event of bankruptcy. But with fixed-rate
underpriced deposit insurance with a pre-
mium, assumed for expositional purposes to
be zero, banks pay less than the promised
amount if bankruptcy occurs in state 1, and
only the promised amount in state 2. If the
bank were to go bankrupt in state 1, then
the future value of the bank’s equity in state
1 is zero, and its current equity value is


(3)  V_0 = (C_0 + D_0)(P_2 A_2)/P_A
           − P_2 D_0/P_D > C_0.


The current value of the bank’s equity
when bankruptcy would occur in state 1
(that is, when bankruptcy is possible) equals
the value of the excess of its deposit obliga-
tions over asset returns (the option value of
deposit insurance), I_0, plus its invested capi-
tal, C_0. That is,


(4)  V_0 = I_0 + C_0

where

(5)  I_0 = D_0 P_1/P_D − (C_0 + D_0) P_1 A_1/P_A


and I_0 > 0.
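
Equations (3) through (5) can be checked against one another numerically. The following sketch uses hypothetical inputs (the same style of illustrative numbers as any textbook two-state example, not data from the paper) and hypothetical function names; it assumes competitive prices P_A and P_D and a bank that is insolvent in state 1 only.

```python
def insured_equity_value(C0, D0, P2, A2, PA, PD):
    """Equation (3): equity is worth zero in state 1, so only state 2 counts."""
    return (C0 + D0) * P2 * A2 / PA - P2 * D0 / PD


def insurance_option_value(C0, D0, P1, A1, PA, PD):
    """Equation (5): state-1 shortfall of assets against promised deposits."""
    return D0 * P1 / PD - (C0 + D0) * P1 * A1 / PA


# Hypothetical inputs for illustration.
P1, P2 = 0.4, 0.5
A1, A2 = 0.8, 1.4
C0, D0 = 10.0, 90.0
PA = P1 * A1 + P2 * A2     # competitive asset price
PD = P1 + P2               # competitive deposit price

# The bank is insolvent in state 1: asset payoff falls short of the promise.
assert (C0 + D0) / PA * A1 < D0 / PD

V0 = insured_equity_value(C0, D0, P2, A2, PA, PD)
I0 = insurance_option_value(C0, D0, P1, A1, PA, PD)
print(round(V0, 4), round(I0 + C0, 4))   # the two agree, as in equation (4)
```

Lowering C0 or spreading A1 and A2 further apart (holding PA fixed) raises I0 in this example, which is the incentive problem the text goes on to describe.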

As is well known, increasing capital, hold- 
ing constant deposits, reduces the value of 
the deposit insurance option, and hence eq- 
uity, and increasing asset risk (increasing 
the payment in state 2 while reducing that 
in state 1 so as to hold the price of the asset 
constant) increases the option value.³ Thus,
the problem facing bank regulators is to 
constrain banks’ incentives to reduce capital 
relative to assets and to increase asset risk. 
The puzzle is why regulators apparently 
succeeded throughout much of the last 50 
years but in recent years apparently have 
failed. 


³Since I_0 > 0, the value of the bank exceeds its
initial capital investment. Thus,

        ∂I_0/∂C_0 |_{D_0} = −P_1 A_1/P_A < 0

and

        ∂I_0/∂A_2 |_{P_A} = −(C_0 + D_0)(P_1/P_A)(∂A_1/∂A_2) > 0

since

        ∂A_1/∂A_2 < 0.


A. Charter Values


If banks can operate only with charters 
that are limited in supply, banks may be
able to acquire assets at below-market prices 
(that is, bank loans would earn higher risk- 
adjusted rates than would market securities) 
and/or they may be able to make below- 
market-value payments on deposits (that is, 
deposits would pay below the risk-adjusted 
rate). Bank charters have been made valu- 
able by limiting their supply and by protect- 
ing banks through various regulations that 
limited interbank competition as well as 
competition by nonbank firms. 

In the model below, banks are assumed 
to be insured (at zero cost)⁴ but face periodic
examinations. At the end of the period,
if the bank is insolvent (that is, its assets, 
not including the charter value, are less 
than deposit obligations), equity holders re- 
ceive nothing, depositors receive their 
promised obligations, and the insurance 
agency receives the bank’s assets, including 
the charter. If the bank is solvent, however, 
the bank retains its charter value and con- 
tinues to operate for another period. 

If a bank chooses capital and asset risk so 
that it will not default in either state, the 
current value of the bank’s equity is 


(6)  V_0 = (C_0 + D_0)(P_1 A_1 + P_2 A_2)/P_A
           − D_0(P_1 + P_2)/P_D + X_0

where

(7)  X_0 = P_1 X_1 + P_2 X_2


in which P_1 X_1 is the current value of the
charter to operate one more period if state
1 occurs, and P_2 X_2 is the current value of
the charter to operate one more period if
state 2 occurs. Thus, the bank must balance
the gains from increased risk taking (I_0)
with the loss of the charter value if bank-
ruptcy occurs (P_1 X_1). The bank will risk


⁴Similar results are obtained if deposit insurance
has a positive cost not related to default risk. The 
assumption of zero cost is employed to simplify the 
analysis. 
bankruptcy only if

(8)  I_0 > P_1 X_1.


Consider a bank just on the verge of
insolvency in state 1 (that is, when a
marginal increase in asset risk or reduction
in capital would cause bankruptcy). The
value of such a bank is


(9)  V_0 = (C_0 + D_0)(P_2 A_2)/P_A
           − D_0 P_2/P_D + X_0.

However,

(10)  ∂V_0/∂D_0 |_{C_0} = −P_1 X_1 < 0.


That is, a marginal increase in deposits
holding capital constant, which causes
bankruptcy in state 1, in turn causes the
bank to lose the value of its charter if state
1 occurs, P_1 X_1. Similarly,


(11)  ∂V_0/∂A_2 |_{P_A, C_0} = −P_1 X_1 < 0.


Thus, a bank initially at a position where
solvency is guaranteed in both states will
not have an incentive on the margin to
increase risk either through increases in
leverage (that is, increases in deposits hold-
ing capital constant) or increases in asset
risk. This remains true throughout the re-
gion where P_1 X_1 > I_0.

Although valuable bank charters do not 
obviate the need for bank regulation because
I_0 is unbounded in the absence of


⁵Marcus (1984) shows that ∂V_0/∂D_0 can be posi-
tive for banks with low charter values and negative for
banks with high charter values using an options pricing
formula. Although he argues that ∂V_0/∂σ (the equiv-
alent of ∂V_0/∂A_2) can be negative, his figure 2 implies
that, for banks with high charter values, ∂V_0/∂σ is
positive. Moreover, in his model, it is unclear what
determines the critical value of whether a marginal
increase in default risk will be beneficial.

In contrast, the state preference model shows that
the choice is one of balancing the gains in the option
value of deposit insurance with loss of the expected
value of the charter. Marginal increases in default risk
will not benefit the bank until default risk is sufficiently
high so that I_0 = P_1 X_1.
regulation, they make the regulator’s job
much easier. If a bank’s capital and asset
risk were initially set so that solvency were
assured in both states, a large discrete in-
crease in asset risk or reduction in capital
sufficient to make I_0 > P_1 X_1 would be re-
quired if the value of the bank’s equity were
to increase. Since such large discrete 
changes presumably would be easy to de- 
tect, banks would be discouraged from try- 
ing to increase default risk in the first place, 
and regulators would not need to be con- 
cerned with small changes in asset risk or 
capital. 
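
The trade-off in condition (8) can be illustrated with a short comparison of equity values. This is a sketch under stated assumptions, not part of the paper: the function name `charter_decision` and every number are hypothetical, prices are taken to be competitive so that the safe bank is worth C_0 + X_0, and the risky bank is assumed to be worth C_0 + I_0 plus only the state-2 part of the charter, as the comparison in the text implies.

```python
def charter_decision(C0, I0, P1, X1, P2, X2):
    """Compare equity value with and without risking bankruptcy in state 1."""
    X0 = P1 * X1 + P2 * X2            # equation (7): current charter value
    v_safe = C0 + X0                  # solvent in both states, charter kept
    v_risky = C0 + I0 + P2 * X2       # charter lost if state 1 occurs
    return {
        "v_safe": v_safe,
        "v_risky": v_risky,
        "risk_bankruptcy": I0 > P1 * X1,   # condition (8)
    }


# Hypothetical inputs: the same option value, two different charter values.
high_charter = charter_decision(C0=10.0, I0=8.6275, P1=0.4, X1=30.0, P2=0.5, X2=30.0)
low_charter = charter_decision(C0=10.0, I0=8.6275, P1=0.4, X1=10.0, P2=0.5, X2=10.0)
print(high_charter["risk_bankruptcy"], low_charter["risk_bankruptcy"])
```

In the example the bank with the valuable charter prefers to stay solvent, while the bank with the eroded charter gains from risking failure, which is the mechanism the empirical section sets out to test.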


B. Market Power 


An uninsured bank that has market (mo-
nopoly) power in its asset market can make
positive net present value loans. That is, the
loans’ future payoffs, when valued at the
exogenously given state prices, exceed their
current cost. For such a bank,


(12)  (P_1 A_1 + P_2 A_2)/P_A = ε > 1.


However, the bank does not face an inex-
haustible supply of such loan opportunities,
and as assets A_0 (which equal C_0 + D_0)
increase, ε, the ratio of cash flows from an
asset to its price, diminishes. (That is, ε =
ε(A_0) and ε′ < 0.) Assuming such a bank
maximizes its net-of-capital investment
value, it will expand until the current value
of its marginal revenue equals the marginal
cost of its deposits (which is 1). That is,
ε(1 + η) = 1, where η is the elasticity of ε
with respect to A_0. (For an uninsured bank,
this condition holds regardless of whether
bankruptcy occurs, assuming no bankruptcy
costs, since the costs of deposits are unaf-
fected by the risk of bankruptcy.) Similar
conditions hold for market power on the
deposit side.


⁶For a bank with market power on the deposit side,
(P_1 + P_2)/P_D = f < 1.


For banks that have market power in
either the asset or deposit market, as sug-
gested originally by George Stigler (1964)
and later and more formally by Lindenberg
and Ross (1981), Tobin’s q is an ideal mea-
sure of market power. In this paper, Tobin’s
q is defined as the current market value of a 
firm’s assets (the market value of its equity 
plus debt) divided by their current cost to a 
firm. The reason that q is an ideal measure 
of monopoly rents is that the capitalized 
value of such rents, whether they arise from 
market power in the asset market, deposit 
market, or both, will be reflected in the
market value of the firm’s equity, and thus 
assets, but not in the costs of acquired as- 
sets. The reasons for q’s superiority as a 
measure of market power are spelled out in 
more detail in Michael Salinger (1984) and 
Michael Smirlock et al. (1984). To see why 
Tobin’s q is a measure of market power in 
the above model, note that q is given by 


(13)  q = {[(C_0 + D_0)ε − D_0 + X_0] + D_0}/(C_0 + D_0)
        = ε + X_0/(C_0 + D_0).


(The terms in brackets [ ] represent the
market value of the bank’s equity to which
the current value of debt D_0 is added.)
Equation (13) shows that q is equal to the
current plus future degree of market power
as reflected in current and future ε. For a
competitive firm, ε = 1 and X_0 = 0, but for a
firm with market power, as discussed above,
ε > 1 and X_0 > 0. Thus, an uninsured bank
with no market power in either the asset or
deposit market would have a q of 1.⁷
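
The link between q and market power can be illustrated numerically. The sketch below is hypothetical throughout: the function name `tobins_q`, the parameter `f` for deposit-side market power (f = 1 meaning none), and all numbers are assumptions made for the example, and the equity expression is the one implied by the state preference model for an uninsured bank rather than a formula quoted from the paper.

```python
def tobins_q(C0, D0, eps, X0, f=1.0):
    """Market value of equity plus debt, divided by the cost of assets.

    eps is the ratio of asset cash-flow value to price (1 if competitive),
    X0 the current charter value, and f the value of deposit obligations per
    dollar of deposits raised (1 if the deposit market is competitive).
    """
    equity = (C0 + D0) * eps - D0 * f + X0   # market value of equity
    return (equity + D0) / (C0 + D0)         # add debt, divide by asset cost


# Hypothetical inputs for illustration.
print(tobins_q(10.0, 90.0, eps=1.0, X0=0.0))    # competitive bank
print(tobins_q(10.0, 90.0, eps=1.05, X0=5.0))   # asset-side market power
print(tobins_q(10.0, 90.0, eps=1.05, X0=5.0, f=0.95))  # deposit side as well
```

In this example q rises above 1 as soon as either source of rents is present, which is why q is used later as the empirical proxy for charter value.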


II. Empirical Evidence 


The data used to estimate the model are 
from several sources. The bank holding 
company data are from the Compustat bank 


⁷For a bank with market power on the deposit side
as well,

        q = ε + [D_0/(C_0 + D_0)](1 − f) + X_0/(C_0 + D_0),

where f < 1 is inversely related to the degree of market
power on the deposit side (see footnote 6). Thus, a bank
with market power on the deposit side also will have a q
greater than one.

tapes, which contain balance sheets, income
statements, and monthly stock prices for 
the 150 largest bank holding companies 
(BHC’s). Although this sample is not repre- 
sentative of the entire population of all 
banks or bank holding companies, which 
comprises many smaller and often privately 
held organizations, the BHC’s in this sam- 
ple hold about 40 percent of all bank assets 
and thus are of interest in their own right. 
Data on the interest cost of large, uninsured 
CD’s are from the Bank Consolidated Re- 
port of Conditions and Income (the Bank 
Call Report). Data on state branching, 
multibank holding company expansion, and 
interstate entry laws are from Amel and 
Keane (1987) and various Federal Reserve 
Annual Statistical Digests. 


A. Measuring Market Power 


As discussed above, q is used to measure 
the degree of market power in banking. The 
measure q is defined as the market value of 
assets (calculated as the sum of the market 
value of common equity—price per share 
times number of shares—and the book value 
of liabilities) divided by the book value of 
assets. The assumption is that the capital- 
ized value of the bank charter will be re- 
flected in the market value of equity (and 
thus the market value of assets as defined 
above), but not the book value of equity or 
assets.⁹ Thus, banks with larger charter val-


⁸Banks use an accounting convention in which loan
loss reserves are counted as book capital. However, 
since loan loss reserves are often taken only when asset 
losses either have already occurred or when they are 
anticipated, they in fact often do not represent book 
capital since the addition to capital usually would be 
offset at least fully by asset losses if they were realized 
on the books. Thus, in this paper, loan loss reserves are 
not counted as book capital. (This procedure follows 
generally accepted accounting procedures as followed 
by bank holding companies in their 10-K and 10-Q 
reports filed with the Securities and Exchange Com- 
mission.)

⁹Although a firm that acquired a bank originally
endowed with a valuable charter would record that 
charter’s value (i.e., the excess of the purchase price 
over the book value) as an intangible asset, intangible 
assets are not counted as primary book equity, the 
measure of book equity used in this paper. 
ues due to market power in asset and/or
deposit markets should have greater mar- 
ket-to-book asset ratios. Note that the abil- 
ity to issue deposits at below-market rates is 
an asset, the value of which will be reflected 
in the market value of the bank’s equity and 
thus the market value of assets as I define 
them. 
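The definition of q above can be sketched in a few lines of Python. This is only an illustration: the function and argument names are assumptions introduced here, not anything from the paper, and the numbers in the usage lines are invented.

```python
def market_to_book_q(price_per_share, shares_outstanding,
                     book_liabilities, book_assets):
    """Market-to-book asset ratio q.

    Market value of assets is taken, as in the text, to be the market
    value of common equity plus the book value of liabilities.
    """
    if book_assets <= 0:
        raise ValueError("book_assets must be positive")
    market_equity = price_per_share * shares_outstanding
    market_assets = market_equity + book_liabilities
    return market_assets / book_assets


def market_value_capital_ratio(price_per_share, shares_outstanding,
                               book_liabilities):
    """Market value of equity divided by the market value of assets."""
    market_equity = price_per_share * shares_outstanding
    return market_equity / (market_equity + book_liabilities)


# Hypothetical bank: 2.0 million shares at $30, $945 million of liabilities,
# $1,000 million of book assets (all figures in $ millions except price).
q = market_to_book_q(30.0, 2.0, 945.0, 1000.0)
k = market_value_capital_ratio(30.0, 2.0, 945.0)
print(f"q = {q:.4f}, market-value capital ratio = {k:.4f}")
```

Because market equity enters both ratios, the two measures move together by construction, which is one reason the paper later turns to instrumental-variable estimates.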

Several difficulties arise in using q as a 
measure of a bank’s market power. First, 
the book value of assets represents the his- 
torical costs of assets acquired and sold over 
time, not the current costs of the assets. 
Thus, ex post market-to-book ratios that 
differ from 1 may reflect different asset re- 
turn realizations rather than the degree of 
market power, which would be reflected in 
the ex ante q. Thus, the theoretically appro- 
priate ex ante q is measured with error 
when using the ex post q. Another reason 
that q may not accurately reflect the degree
of market power is that the value of poten-
tially underpriced deposit insurance could
be capitalized into a bank’s market value.¹⁰
Several methods are used in the empirical 
analysis to control for these possibilities. 

First, and most importantly, simultaneous 
equations techniques are used. In the first 
stage, an instrument is created for q, and 
then the predicted ratio is used as an ex- 
planatory variable in second-stage equations 
of bank risk taking. Thus, the empirical 
model allows for both the possible 
endogeneity of q and the fact that q is 
measured with error. 

Second, a sample of banking organiza- 
tions is selected to have similar histories. To 
do this, the sample is restricted to banking 
organizations for which data are continu- 
ously available from 1970 through 1986. 
Moreover, two variables are included to 
control for different asset histories: a dummy
variable for banks that were on the Compu- 
stat tapes in 1964 and survived to be in- 
cluded in the sample (about 38 percent of 


¹⁰As Smirlock and Gilligan (1984) argue, a q greater
than 1 also could reflect the capitalized value of a 
firm-specific efficiency enhancing factor of production. 
However, a firm possessing such a factor would have 
the same incentives to protect its value as would a firm 
possessing market power. 
the sample) and each bank’s asset growth
rate since 1970.


B. Model Structure 


The empirical model consists of two sets
of equations. In the first set of regressions, 
q is regressed on dummy variables that equal 
1 during a given period if there was a liber- 
alization in the laws governing state branch- 
ing, multibank holding company, and inter- 
state expansion, respectively, during any 
previous period. These regressions also con- 
tain a set of control variables and other 
proxies for market power such as the ratio 
of demand deposits to total deposits, the 
ratio of foreign deposits to total deposits, 
and the ratio of loans to assets. (The 
branching variables are constructed based 
on data in Amel and Keane [1987] and 
various Federal Reserve Statistical Ab- 
stracts.) The balance sheet data are from 
Compustat and refer to the consolidated 
holding company. 

The hypotheses are that unanticipated 
liberalization of legal entry barriers should 
erode banks’ market power and thus neg- 
atively affect q and that greater deposit 
funding and loan making also might be re- 
lated to market power and thus positively 
affect q. 

In a second set of regressions, both actual 
q and an instrument for q are explanatory 
variables in equations that attempt to ex- 
plain bank risk taking. The hypotheses are 
that the banks with greater market power 
should have larger capital-to-asset ratios and 
lower risks of default. 

The system to be estimated can be repre- 
sented as 


(14)  $q_{it} = X_{1it}\beta_1 + \varepsilon_{1it}$

(15)  $\mathrm{risk}_{it} = X_{2t}\beta_2 + q_{it}\beta_3 + \varepsilon_{2it}$

where

$X_{1it}$ is a vector of branching, financial,
control, and other variables for bank i at
time t;

$X_{2t}$ is a vector of financial and other con-
trol variables at time t;

$q_{it}$ is the bank’s market-to-book asset ratio
at time t [in the two-stage least-squares
estimates, equation (14) is used to con-
struct an instrument for $q_{it}$];

$\mathrm{risk}_{it}$ is a measure of bank default risk
(two measures are used: the market-value
capital-to-asset ratio [the market value of
common equity divided by the market
value of equity plus the book value of
liabilities] and the interest cost on large
CD’s);

$\beta_1$ and $\beta_2$ are vectors of coefficients;

$\beta_3$ is the effect of q on risk; and

$\varepsilon_{1it}$ and $\varepsilon_{2it}$ are random error terms.
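The two-equation system can be illustrated with a small two-stage least-squares sketch in Python. Everything below is an assumption made for illustration: the data are simulated, the variable names are hypothetical, and the coefficient values are arbitrary and are not the paper's estimates. The point is only to show how an instrument for q removes the bias that arises when q and risk share an unobserved component.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X b + e."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def two_stage_least_squares(y, endog, exog, instruments):
    """2SLS with a single endogenous regressor.

    exog: (n, k) exogenous regressors, including a constant column.
    instruments: (n, m) excluded instruments (e.g., branching dummies).
    Returns coefficients ordered as [exog..., endog].
    """
    Z = np.column_stack([exog, instruments])
    endog_hat = Z @ ols(Z, endog)          # first stage, in the spirit of (14)
    X_hat = np.column_stack([exog, endog_hat])
    return ols(X_hat, y)                   # second stage, in the spirit of (15)


def simulate(n=20000, seed=0):
    """Synthetic data (not the paper's); u is an unobserved confounder."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, n).astype(float)    # hypothetical liberalization dummy
    u = rng.normal(size=n)
    q = 1.0 - 0.5 * z + 0.5 * u + rng.normal(scale=0.5, size=n)
    risk = 0.1 + 0.9 * q - 1.0 * u + rng.normal(scale=0.5, size=n)
    return z, q, risk


z, q, risk = simulate()
const = np.ones((len(q), 1))
b_ols = ols(np.column_stack([const, q]), risk)[1]
b_tsls = two_stage_least_squares(risk, q, const, z.reshape(-1, 1))[1]
print(f"OLS slope:  {b_ols:.3f}")
print(f"2SLS slope: {b_tsls:.3f}")
```

In this simulated setup the true slope on q is 0.9; the OLS slope is pulled away from it by the confounder, while the 2SLS slope should land close to it. Standard errors from a naive second-stage regression are not valid and would need the two-step correction the paper mentions.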


C. Empirical Results 


Table 1 contains descriptive summary 
statistics for the bank holding companies in 
the sample using fourth-quarter (year-end) 
data for the 1970-86 period.¹¹ It is interest-
ing to note that market-to-book asset ratios
range from 0.95 to 1.18 and that market- 
value capital-to-asset ratios range from 
0.0075 to 0.21. Although it is not shown in 
the table, about 32 percent of the bank 
holding companies were in states that liber- 
alized branching laws, 44 percent were in 
states that liberalized multibank holding 
company expansion laws, and 78 percent 
were in states that liberalized interstate en- 
try laws by 1986. 


D. Market-to-Book Asset Ratios 


Table 2 contains the results of the esti- 
mates of equation (14) over the 1971-86 
period using fourth-quarter data. Estimates 
of the effects of four different types of vari- 
ables are reported: branching variables, 
control variables, balance sheet variables, 
and financial variables. The branching vari- 
ables are dummies reflecting the liberaliza- 
tion during a previous period of branching, 


¹¹Some of the data items were missing on the Com-
pustat tapes for particular bank holding companies in
particular quarters. Rather than exclude the entire 
observation, missing values were forecast on the basis 
of quadratic time-trend OLS regressions estimated 
separately for each bank. For variables known a priori 
to be nonnegative, forecast values were constrained to 
be nonnegative. 
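The imputation rule described in this footnote can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, missing values are marked with NaN, time is indexed 0, 1, 2, ..., and `numpy.polyfit` stands in for whatever OLS routine was actually used.

```python
import numpy as np


def fill_missing_quadratic_trend(values, nonnegative=False):
    """Fill NaNs in one bank's series with a fitted quadratic time trend.

    The trend y = a*t**2 + b*t + c is fit by least squares to the observed
    points only. If nonnegative is True, filled values are floored at zero.
    """
    y = np.asarray(values, dtype=float)
    t = np.arange(len(y), dtype=float)
    observed = ~np.isnan(y)
    if observed.sum() < 3:
        raise ValueError("need at least three observed points")
    coeffs = np.polyfit(t[observed], y[observed], deg=2)
    filled = y.copy()
    filled[~observed] = np.polyval(coeffs, t[~observed])
    if nonnegative:
        filled[~observed] = np.maximum(filled[~observed], 0.0)
    return filled


# Hypothetical series with one missing quarter.
series = [0.0, 1.0, np.nan, 9.0, 16.0]
print(fill_missing_quadratic_trend(series))
```

Fitting each bank separately, as the footnote describes, means calling the function once per bank and per variable, with `nonnegative=True` for variables that cannot be negative.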
TABLE 1—SUMMARY STATISTICS FOR 85 LARGE BANK HOLDING COMPANIES
(POOLED 1970-86, FOURTH-QUARTER DATA)

Characteristic                                    Mean            Minimum         Maximum

Liberalization of Branching Law
  (Dummy Variable)                                0.19            0               1
Liberalization of Multibank Holding Co. Law
  (Dummy Variable)                                0.26            0               1
Liberalization of Interstate Entry Law
  (Dummy Variable)                                0.12            0               1
Book Value of Assets
  (Net of Loan Loss Reserves in $ Millions)       $10,587         $278            $195,147
Market-to-Book Asset Ratio, q                     1.00            0.95            1.18
Multinational Regulatory Status
  (Dummy Variable)                                0.19            0               1
Foreign Deposits/Total Deposits                   14 percent      0 percent       89 percent
(Cash + Treasury Securities)/Total Assets         24 percent      7.8 percent     50.1 percent
Annual Assets Growth Rate Since 1970              12 percent      -1 percent      63 percent
Demand Deposits/Total Deposits                    35.9 percent    0 percent       72.3 percent
Market-Value Capital-to-Asset Ratio               0.056           0.0075          0.21
Book-Value Capital-to-Asset Ratio                 0.055           0.010           0.14
New York Composite Index                          70.8            35.4            142.1
3-Month Treasury-Bill Rate                        7.64 percent    4.02 percent    15.66 percent
20-Year Treasury-Bond Rate                        9.03 percent    5.96 percent    13.73 percent
Average Maturity of CD’s (Months)                 6.45            1.19            19.81
Interest Cost of CD’s                             0.085           0.050           0.13


multibank holding company, and interstate
expansion laws in the state in which the
bank is located. The financial variables are
the New York Composite Index, the three-
month Treasury-bill rate, and the 20-year
Treasury-bond rate. They are included to
control for the effects of general interest
rate and stock market trends on the market-
to-book ratio that would not be related to
changes in market power.

Other variables are included as proxies
for market power. While some of them
might be endogenous, the coefficient esti-
mates of branching variables are not sen-
sitive to their inclusion. The estimated
coefficients generally conform with a priori
expectations. Liberalization of branching or
multibank holding company expansion laws
in a previous period is associated with a
statistically significantly (at the 1-percent
level) lower market-to-book asset ratio. This
suggests that both branching and multibank
holding company expansion restrictions do
provide banks a degree of protection from
competition. This finding is consistent with
a study by Mark J. Flannery (1984) that
finds that unit banks in unit banking states


earn monopoly profits approximately 20 
percent above those reported by similar 
banks in branching states, as well as other 
studies that have found that branching and 
multibank holding company expansion re- 
strictions lead to higher loan rates and lower 
deposit rates.¹² (See Allen N. Berger and
Timothy H. Hannan [1987] for recent evi- 
dence that restricted branching leads to 
lower deposit rates.) However, no signifi- 
cant effect of liberalization of interstate en- 
try restrictions is found. This may reflect the 
fact that these laws generally allow entry 
only by acquisition (which would increase 
market prices) and do not allow de novo 
entry, which would directly increase compe- 
tition and thus diminish market prices. 


¹²There is an extensive literature on the effects of
branching restrictions on competition. Generally, these
studies show that branching restrictions are associated
with reduced competition. For example, Donald T. 
Savage and Stephen A. Rhoades (1979) found banks in 
statewide branching states paid higher rates on de- 
posits. Also, a survey by George J. Benston (1973) finds 
general gains in service for the banking public. See 
Savage and Elinor H. Solomon (1980) for a discussion 
of the literature. 
TABLE 2—POOLED TIME-SERIES CROSS-SECTION REGRESSION FOR 85 LARGE BANK
HOLDING COMPANIES RELATING THE MARKET-TO-BOOK ASSET RATIO TO VARIOUS
DETERMINANTS OF MARKET POWER, 1971-86, FOURTH-QUARTER DATA
(STANDARD ERRORS IN PARENTHESES)

R²                                                0.42
Number of Observations                            1360
Dependent Variable Mean
  (Market-to-Book Asset Ratio, q)                 1.00

Variable                                          Coefficient     (Standard Error)

Intercept                                         -0.94***        (0.011)

Branching Variables
  Liberalization of Branching Law                 -0.0046***      (0.0017)
  Liberalization of Multibank Holding Co. Law     -0.0074***      (0.0015)
  Liberalization of Interstate Entry Law          0.00050         (0.0024)

Control Variables
  Dummy for Being on Compustat in 1964            0.0054***       (0.0013)
  Dummy for Multinational Status                  -0.0053**       (0.0023)

Balance Sheet Variables
  Book-Value Asset Growth Since 1970              0.21***         (0.010)
  Demand Deposits/Total Deposits (x 100)          0.00056***      (0.000067)
  Loans/Total Assets                              0.0098          (0.012)
  Foreign Deposits/Total Deposits (x 100)         0.00019***      (0.000045)
  Book Value of Assets                            -4.09E-?        (4.49E-?)
  (Cash and Treasury Securities)/Total Assets     -0.000027       (0.00013)

Financial Variables
  N.Y. Composite Index                            0.0038***       (0.000030)
  3-Month Treasury-Bill Rate                      -0.00078**      (0.00032)
  20-Year Treasury-Bond Rate                      -0.0019***      (0.00047)

*Significant at the 10-percent level.
**Significant at the 5-percent level.
***Significant at the 1-percent level.


The balance sheet variables generally have
signs consistent with the notion that market
power arises in deposit and loan markets,
although statistically significant effects are
found only for deposit markets. The frac-
tion of demand and the fraction of foreign
deposits in total deposits are positively and
significantly related to the market-to-book
asset ratio. The point estimate of the effect
of the fraction of loans in total assets on the


market-to-book ratio is positive, and the 
estimate of the effect of the ratio of cash 
and treasury securities to total assets is neg- 
ative, although neither variable is signifi- 
cant. Banks with more rapid asset growth in 
the past have significantly higher market- 
to-book ratios, perhaps because more rapid 
growth is associated with lack of competi- 
tion or success due to other factors. Asset 
size per se, however, is not significant. 
The control variable, whether a bank was
on Compustat in 1964, is positively and sig- 
nificantly related to the market-to-book ra- 
tio, perhaps indicating that older, promi- 
nent bank holding companies that survived 
to be included in the sample have higher 
market values than bank holding companies 
that entered the sample later. The variable 
for multinational status is negatively related 
to the market-to-book ratio, which might be 
due to the very competitive international 
environment in which these 16 money cen- 
ter banks (as defined by the Federal Re- 
serve) operate. 

Finally, the effects of the financial vari- 
ables are much as one might expect. Stock 
market values are positively related to the 
market-to-book ratio, and interest rates are 
negatively related. 

Overall, the high correspondence be- 
tween the expected effects of the variables 
and their estimated effects suggests that the 
market-to-book ratio is in fact a proxy for 
market power. Next, I test whether this 
proxy for market power is negatively related 
to bank risk taking. 


E. Bank Risk 


Below, the effects of the market-to-book 
asset ratio on two measures of bank default 
risk are examined. As discussed above, a 
key hypothesis is that the decline in banks’ 
market power, as proxied by their market- 
to-book ratios, was a primary cause of the 
decline in banks’ capital-to-asset ratios. 
Moreover, the theory implies that the 
cross-sectional variation in bank capital ra- 
tios would be influenced by variations in 
market power—banks with greater market 
power should have higher capital ratios. 


F. Market-Value Capital-to-Asset Ratios


Table 3 presents coefficient estimates of 
equation (15), in which the market-value 
capital-to-asset ratio (i.e., the market value 
of capital divided by the market value of 
assets, defined as the market value of equity 
divided by the market value of equity plus 
the book value of liabilities) is regressed on 
the market-to-book asset ratio, q, holding 
stock market and interest rate trends con-
stant. It is necessary to hold stock market
and interest rate trends constant since these 
trends potentially could influence both the 
dependent and independent variables, thus 
leading to spurious correlation. In addition, 
dummies for being on Compustat in 1964 
and for multinational status are included. 

In the first column of Table 3, ordinary 
least squares (OLS) estimates from a pooled 
time-series cross-section regression are re- 
ported. The OLS results suggest a strong, 
positive, and statistically significant relation- 
ship between the proxy for market power, 
the market-to-book asset ratio, and the 
market-value capital-to-asset ratio. Thus, as 
predicted, banks with more market power 
appear to hold more capital relative to as- 
sets. Moreover, the estimated magnitude of 
the effect is large, with a 10 percentage 
point increment in the market-to-book asset 
ratio leading to a 0.09 increase in the mar- 
ket-value capital-to-asset ratio, and is not 
sensitive to whether the variation in the 
market-to-book ratio is due to changes over 
time or differences across banks.¹³

There are, however, several reasons why 
the OLS estimates should be viewed with 
caution. First, endogeneity between q and 
bank risk is possible. For example, a bank 


¹³To assess how sensitive the results were to pooling
the cross sections over time, separate cross section 
regressions were run for each year from 1971 to 1986. 
Each of the OLS point estimates of the effect of the
market-to-book asset ratio on the market-value capital-
to-asset ratio was significantly different from zero at
the 1-percent level and ranged from 0.62 to 1.19, 
approximately the same magnitude as the pooled 
time-series cross section results reported in Table 3. 

Since different banks can have different responses 
to the market index (as proxied by the New York 
Composite), I also estimated an unconstrained version 
of equation (15) with separate intercepts and separate 
slope coefficients for the New York Composite Index 
for all 85 bank holding companies. However, the esti- 
mate of the effect of the market-to-book ratio was 
statistically significant and about the same magnitude 
as in the constrained model estimates reported in 
Table 3. Thus, the results appear robust regarding the
source of variation in the market-to-book capital ratio 
—both the time-series and cross-sectional variation in 
banks’ market-value capital-to-asset ratios are posi- 
tively associated with time-series and cross-sectional 
variation in banks’ market-to-book capital ratios. 
TABLE 3—POOLED TIME-SERIES CROSS SECTION REGRESSION RELATING THE MARKET-VALUE
CAPITAL-TO-ASSET RATIO TO THE MARKET-TO-BOOK ASSET RATIO AS A
PROXY FOR MARKET POWER, FOURTH-QUARTER DATA, 1971-86
(STANDARD ERRORS IN PARENTHESES)

                                            OLS            TSLSᵃ          TSLSᵇ

R²                                          0.83           0.66           0.15
Number of Observations                      1360           1360           1360
Dependent Variable Mean                     0.054          0.054          0.054
  (Market-Value Capital-to-Asset Ratio)

Intercept                                   -0.657**       -0.70**        -1.00***
                                            (0.014)        (0.026)        (0.29)
N.Y. Composite Index                        -0.0000044     0.000032**     -0.000052
                                            (0.000013)     (0.000015)     (0.000029)
3-Month Treasury-Bill Rate                  -0.00042**     -0.00043**     -0.00042
                                            (0.00018)      (0.00018)      (0.00051)
20-Year Treasury-Note Rate                  -0.00023       -0.00091***    0.00065
                                            (0.00025)      (0.00028)      (0.0015)
Dummy for Being on Compustat in 1964        0.0018**       0.0021         -0.054
                                            (0.00073)      (0.00076)      (0.046)
Dummy for Multinational Status              -0.018***      -0.018***      -0.032**
                                            (0.00094)      (0.00098)      (0.015)
Market-to-Book Asset Ratio (q)              0.91***        0.77***        1.09***
                                            (0.013)        (0.025)        (0.27)

*Significant at the 10-percent level.
**Significant at the 5-percent level.
***Significant at the 1-percent level.

ᵃIncludes as instruments all variables on right-hand side of regression reported in Table 2.
ᵇIncludes as instruments only the branching, multibank holding company, and interstate expansion dummies, the
financial variables, the on-Compustat dummy, and the dummy for multinational status.


with greater default risk could have a greater
market-to-book asset ratio if deposit insur-
ance were underpriced and its value were
capitalized in a bank’s market (but not book)
value. Second, the market-to-book asset ra-
tio measures market power with error due
to ex post asset return realizations that are
different from ex ante expectations. Third, q
and the market-value capital-to-asset ratio
might be spuriously correlated due to the
presence of the market value of the bank’s
equity on both sides of the equation. Al-
though the equation is not an identity, to
the extent the ratio of the book value of
liabilities to the book value of assets were
approximately constant or much less vari-
able than the ratio of the market value of
equity to the book value of assets, the esti-
mated OLS coefficient on q would be bi-
ased toward 1. For these reasons, a simulta-
neous equations model is employed, and
two-stage least squares (TSLS) estimates
also are displayed in Table 3. By employing
TSLS techniques, exogenous variation in q
is related to actual variation in market-value
capital-to-asset ratios, thus avoiding the po-
tential endogeneity and measurement error
problems associated with q. TSLS tech-
niques also solve the problem of potential
spurious correlation since the actual market
value of common equity is not a right-
hand-side variable. (The method of estima-
tion employed produces standard errors
corrected for the two-step nature of the
procedure.)
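The claim that the OLS coefficient on q would be biased toward 1 can be made concrete with a short derivation. The symbols E, L, A, k, and ℓ are introduced here for illustration and are not the paper's notation; the numerical value uses the mean book-value capital-to-asset ratio from Table 1 and assumes, as a simplification, that book assets equal book liabilities plus book equity.

```latex
% E = market value of equity, L = book value of liabilities,
% A = book value of assets, k = market-value capital-to-asset ratio.
\begin{align*}
q &= \frac{E + L}{A}, \qquad
k = \frac{E}{E + L} = 1 - \frac{L}{E + L} = 1 - \frac{L/A}{q}. \\
\intertext{If $\ell \equiv L/A$ is approximately constant across observations, then}
\frac{\partial k}{\partial q} &= \frac{\ell}{q^{2}}
  \approx \frac{1 - 0.055}{1.00^{2}} = 0.945 .
\end{align*}
```

Under these assumptions the mechanical slope of k on q near q = 1 is a little below 1, which is why an OLS estimate close to 1 cannot by itself be read as evidence about market power.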

In the first-column TSLS estimates, the
instruments used to predict the market-to-
book capital ratio include all of the explana-
tory variables in equation (14). In the sec-
ond-column TSLS estimates, only the
branching, multibank holding company and
interstate expansion dummies, the financial
variables, the on-Compustat dummy, and
the multinational dummy were included as
instruments, variables believed to be exoge-
nous. Thus, in these second-column esti-
mates, the exogenous variation in q is due
mainly to exogenous variation in the
branching variables. However, the TSLS es-
timates of the effect of the market-to-book
capital ratio on the market-value capital-
to-asset ratio are very similar to the OLS
estimate, especially the second TSLS esti-
mate. This may be because the biases due
to measurement error and endogeneity due
to the capitalization of underpriced deposit
insurance are offsetting. Thus, the finding
that banks with more market power hold
more capital relative to assets is robust with
respect to the estimation method and speci-
fication of the model.

Moreover, the decline over time in banks’
market-value capital-to-asset ratios is
strongly associated with the decline in their
market-to-book asset ratios. Chart 3 shows
that the mean of banks’ market-value capi-
tal-to-asset ratios follows the mean of their
market-to-book ratios over time. According
to the theory developed in Furlong and
Keeley (1987, 1989), banks with more capi-
tal have less incentive to increase asset risk.
Thus, as long as the stringency of asset risk
regulation is not less at banks with stronger
capital positions, such banks should have
lower default rates and thus represent lower
risk exposures to the FDIC.¹⁴ In the next
section, I present a test of this hypothesis,
proxying risk exposure with the interest cost
on large, uninsured CD’s.


G. Interest Cost of Large CD’s


Ideally, one would like to relate market
power in banking directly to the risk expo-
sure of the FDIC and a bank’s uninsured
depositors. While the FDIC’s risk exposure
is not directly observable,¹⁵ one can obtain a


¹⁴As Furlong and Keeley (1987a, 1989) show, the
incentive to increase asset risk declines as leverage 
declines because the second derivative of the value of 
the insurance put option with respect to asset risk with 
respect to leverage is positive. This result holds in a 
multistate state preference or a continuous option pric- 
ing framework, but it does not hold for the simple 
two-state model presented in this paper. 

¹⁵As Marcus and Shaked (1984) and Ronn and
Verma (1986) have shown, it is possible to estimate the
FDIC’s risk exposure by using an options pricing model
that relates the observed risk of bank equity (with
deposit insurance) to the unobserved risk of bank as-
sets (absent deposit insurance). However, such esti-
mates require assumptions about bank closure policy,
which bank and bank holding company liabilities are
measure of the risk premium on a bank’s
uninsured deposits, assuming that unin-
sured depositors behave as if they are not
implicitly fully insured. The hypothesis is
that the rate on large (over $100,000) cer-
tificates of deposit (CD’s) contains a risk
premium related to the bank’s default risk, 
which should be negatively related to a 
bank’s market power as reflected in its mar- 
ket-to-book asset ratio, q. 

Although there is a debate regarding 
whether, in fact, large CD’s are implicitly 
fully insured (and thus whether they contain 
a risk premium), recent empirical evidence 
in Timothy H. Hannan and Gerald Han- 
weck (1988) and Christopher James (1987) 
strongly suggests that they do contain a risk 
premium. Moreover, if large CD’s are sold 
in a national (or international) market, it is 
hard to imagine other factors that could
explain the rate differences among banks on 
CD’s of identical maturity. 

The interest cost of large CD’s is esti- 
mated from information contained in the 
Bank Consolidated Report of Condition and 
Income (the Bank Call Report). The aver- 
age rate on large CD’s is estimated by divid- 
ing the total interest paid (by all of the 
banks in the bank holding company) by the 
average dollar value outstanding during the 
year. Because of difficulties in constructing 
consolidated interest costs and amounts 
outstanding for all of the 85 bank holding 
companies in the previous sample, the sam- 
ple had to be restricted to 77 bank holding 
companies for which complete information 
could be obtained. 
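The consolidated CD rate described above amounts to a ratio of sums across a holding company's subsidiary banks. A minimal Python sketch follows; the function name, the input layout, and the example figures are assumptions made for illustration and are not taken from the Call Report.

```python
def average_cd_rate(banks):
    """Average interest cost of large CD's for one bank holding company.

    banks: iterable of (interest_paid, average_balance) pairs, one pair per
    subsidiary bank, both measured over the same year and in the same units.
    Returns total interest paid divided by total average balance outstanding.
    """
    banks = list(banks)
    total_interest = sum(interest for interest, _ in banks)
    total_balance = sum(balance for _, balance in banks)
    if total_balance <= 0:
        raise ValueError("total average balance must be positive")
    return total_interest / total_balance


# Hypothetical holding company with two subsidiary banks ($ millions).
rate = average_cd_rate([(8.0, 100.0), (4.5, 50.0)])
print(f"average CD rate = {rate:.4f}")
```

Summing interest and balances before dividing weights each subsidiary by its CD volume, which is what consolidation at the holding-company level implies.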

Following James (1987), a weighted aver- 
age maturity of CD’s also is constructed to 
control for differences in the maturities of 
CD’s outstanding. Since the data needed to 
construct this variable have only been col-
lected since 1984, only the 1984-86 data are


insured, and a number of other factors (such as the 
market value of preferred stock, whether bank holding 
company assets will be used to support a failing bank, 
etc.). Because of uncertainty regarding just what as- 
sumptions might be appropriate and possible changes 
in a number of these factors over time, I focus on CD 
rates, a more directly observable proxy for bank default 
risk. 
used. Following James, I also control for
time-series variation in the level of interest 
rates by including the average yield on 
three-month Treasury bills in the CD rate 
regressions. 

The regression results are reported in 
Table 4. In the first column, estimates of 
the effect of the market-value capital-to- 
asset ratio on the CD rates are presented. 
As mentioned above, banks with more mar- 
ket power, as reflected in higher market-to- 
book asset ratios, hold more capital relative 
to assets, in theory, in order to protect their 
valuable charters. If this theory is correct,
banks with more capital relative to assets 
should have lower default probabilities and 
thus should have lower CD rates. The re- 
sults in the first column of Table 4 confirm 
this hypothesis: banks with greater market- 
value capital-to-asset ratios pay lower CD 
rates. In fact, a 1 percentage point increase 
in a bank’s capital-to-asset ratio would lower 
its CD rate by 14 basis points.¹⁶ This result
is somewhat larger than but comparable to the
8-basis-point effect Hannan and Hanweck
(1988) found using the book-value capital-
to-asset ratio in a somewhat different speci-
fication.
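The conversion from the regression coefficient to basis points can be written out explicitly. The sketch below assumes that both the CD rate and the capital-to-asset ratio are measured as decimal fractions, which is consistent with the dependent-variable mean of 0.085 reported in Table 4; the symbols are introduced here for illustration.

```latex
% r = interest cost of large CD's, k = market-value capital-to-asset ratio,
% \beta_k = estimated coefficient on k in the first column of Table 4.
\[
\Delta r = \beta_k \,\Delta k = (-0.14)(0.01) = -0.0014
         = -14 \text{ basis points}.
\]
```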

In the second and third columns of Table 
4, estimates of the effects of the market-to- 
book asset ratio on the CD rate are pre- 
sented. As in the previous equations re- 
ported in Table 3, both OLS and TSLS 
results are presented. The TSLS results are 
obtained by using all of the right-hand-side 
variables in equation (14) (less one interest 
rate trend variable, due to degrees of free- 
dom limitations) as instruments. The results 
confirm the hypothesis that banks with more 
market power have lower default risk. 

As expected, the coefficient on the mar- 
ket-to-book asset ratio is negative and sig- 
nificantly different from zero at the 1-per- 
cent level. Moreover, the effect, while not 
large, is economically meaningful (each 
1 percentage point increment in the market- 
to-book asset ratio reduces the average CD 


¹⁶This result is also consistent with the hypothesis
that banks with larger capital-to-asset ratios have less 
incentive to increase asset risk. 
TABLE 4—POOLED TIME-SERIES CROSS SECTION REGRESSION RELATING INTEREST COST OF LARGE CD’S
TO THE MARKET-TO-BOOK ASSET RATIO AS A PROXY FOR MARKET POWER AND THE MARKET-VALUE
CAPITAL-TO-ASSET RATIO, FOURTH-QUARTER DATA, 1984-86 (77 BANK HOLDING COMPANIES)
(STANDARD ERRORS IN PARENTHESES)

                                            OLS            OLS            TSLS

R²                                          0.43           0.43           0.39
Number of Observations                      231            231            231
Dependent Variable Mean                     0.085          0.085          0.085
  (Interest Cost of CD’s Divided by CD’s Outstanding)

Intercept                                   0.02?***       0.020***       0.19***
                                            (0.0064)       (0.0043)       (0.070)
3-Month Treasury-Bill Rate                  0.83***        0.82***        0.81***
                                            (0.074)        (0.074)        (0.084)
Average Maturity of CD’s                    0.0011***      0.0011***      0.00073
                                            (0.00025)      (0.00025)      (0.00055)
Market-to-Book Asset Ratio (q)                             -0.18***       -0.16**
                                                           (0.042)        (0.066)
Market-Value Capital-to-Asset Ratio         -0.14***
                                            (0.034)

*Significant at the 10-percent level.
**Significant at the 5-percent level.
***Significant at the 1-percent level.


cost by 16-18 basis points). It is important
to recognize that this estimated effect arises
from both the time-series and the
cross-sectional variation in CD costs. This
result is consistent with the significant posi-
tive effect of the market-to-book ratio on
the market-value capital-to-asset ratio re-
ported in Table 3 and the significant nega-
tive relationship between the market-value
capital-to-asset ratio and the CD risk pre-
mium reported in column one of Table 4.


III. Summary and Conclusion


This paper addresses two major empirical
puzzles. Why has the deposit insurance sys-
tem worked as well as it has over much of
its history even though it provides a moral
hazard for excessive risk taking, and why is
the cross-sectional distribution of bank risk
taking nonuniform? The hypothesis is that
various anticompetitive restrictions en-
dowed banks with market power and made
banking charters valuable. The potential loss
of a charter in the event of bankruptcy
created, in effect, a regulatory bankruptcy
cost, which counterbalanced the incentive

for excessive risk taking due to fixed-rate
deposit insurance.

The empirical results are consistent with
this hypothesis. Banks with more market
power, as reflected in larger market-to-book
asset ratios, hold more capital relative to
assets (on a market-value basis) and they
have a lower default risk as reflected in
lower risk premiums on large, uninsured
CD’s. Thus, at least some of the increase in
bank and thrift failures and payouts from
the deposit insurance funds may be due to a
general decline in the value of bank char-
ters associated with increased competition
within the banking and financial service in-
dustry.

In the past, the perverse incentives cre-
ated by the deposit insurance system were
countervailed by the potential loss of a valu-
able charter that induced banks to limit
their own risk taking. This does not mean
that it is desirable or even possible to return
to a system of anticompetitive restrictions in
order to reduce banking risk. But it does
mean that the deposit insurance system must
be reformed to reduce the rewards it pro-
vides for excessive risk taking.
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Cverdrafts and the Demand for Money 


By AVNER BAR-ILAN*


This paper presents a stochastic analysis of the demand for interest-bearing
money, such as NOW accounts, when overdrafting is allowed at some penalty
rate. It is shown that the short-run interest elasticity of money demand is probably
large (in absolute value) and negative, but in the long run this elasticity is much
smaller or even positive. It is also argued that current definitions of the monetary
aggregates, which exclude unused credit, may spuriously generate instability of
money demand. An alternative definition of money stock is suggested, and seems
to be conceptually more satisfying. (JEL 311)


The proposition that unused credit should
count as money goes back at least to Keynes,
who wrote in 1930:


There exists in unused overdraft facilities
a form of Bank-Money of growing
importance, of which we have no statistical
record... the Cash Facilities,
which are truly cash for the purposes 
of the Theory of the Value of Money, 
by no means correspond to the Bank 
Deposits which are published. The lat- 
ter...take no account of something 
which is a Cash Facility, in the fullest 
sense of the term, namely, unused 
overdraft facilities. 

[Keynes, 1930 pp. 42—43] 


Although many economists, both before
and after Keynes, have expressed similar
views,¹ no explicit derivation of the demand
for money with overdrafts has been carried
out. This paper is an attempt to remedy this
omission. In addition, it generalizes previous
models by allowing some components of
the money supply, such as NOW accounts,
to bear interest. This generates rich dynamics
of the response of the money stock to
changes in interest rates.


*Department of Economics, Tel-Aviv University,
69978, Tel-Aviv, Israel. My thanks to Alex Cukierman,
Benjamin Eden, M. June Flanders, Meir Kohn, Alex
Zanello, and two referees for their very useful comments.
Financial support from the Foerder Institute
for Economic Research is gratefully acknowledged.

¹Two of the numerous examples are Lavington
(1921) and Laffer (1970).

According to the transactions theory of
money demand,² the optimal rule of money
holding is a trigger-target rule; that is, the
money stock is adjusted to the target only
when it falls below the trigger.³ However,
an assumption that is common to virtually
all papers in the field is to constrain the
trigger to an exogenous value, which is usually
zero.⁴ Optimization is then carried out
on the target level only. In the solution
presented here, both the target and the
trigger are chosen optimally. This is accomplished
by using impulse control, a relatively
new technique of optimal control.⁵

A new definition of the money supply, 
which is closely related to the one offered 
by Keynes, is suggested. By including out- 
standing credit, the new definition seems 
conceptually more appropriate as a measure 
of the quantity of the medium of exchange. 
Moreover, it means that current definitions 
of the money stock, which assign zero weight 
to outstanding credit, may impart a consistent
bias to the measures of money. This
can shed some light on the difficulties in
applying the transactions theory in order to
measure money (for example, the “missing
money” puzzle of Goldfeld [1976]) or the
excess volatility of the velocity of money.

The structure of the paper is as follows.
Section I demonstrates some of the important
insights in a simple, deterministic
model. Section II presents the problem of
optimal money holding with a stochastic
disbursement process, while Section III describes
the solution. Section IV discusses
some of the implications of this solution,
and Section V elaborates on the consequences
for the definition of money. Concluding
comments are offered in Section VI.

²The transactions theory of money demand originated
in a deterministic framework due to Baumol
(1952) and Tobin (1956) and a stochastic version by
Miller and Orr (1966). Some of the recent examples
that extended these original works are Milbourne,
Buckholtz, and Wasan (1983) and Romer (1986, 1987).
The most general solution, and the one that is closest
to this paper, is that of Frenkel and Jovanovic (1980).

³The reason for the optimality of this rule is a fixed
transaction cost in converting bonds to money.

⁴Equating the trigger level to zero is implicitly
equivalent to excluding the possibility of overdrafting.

⁵Foundations of impulse control can be found in
Bensoussan and Lions (1982).


I. A Simple Deterministic Model 


Some of the basic insights of the general
case can be demonstrated in a deterministic
framework with no discounting. Consider
the portfolio choice of an individual or a
business firm. There are two assets: money,
the medium of exchange, and another asset,
called “bonds,” which cannot be used as a
means of payment. Hence, people must hold
money to complete their transactions even
though they implicitly pay a liquidity premium
for doing so, since money yields less
interest than bonds. The amount of money
held is determined by minimizing the expected
costs associated with holding money.
These costs take the following form:

(i) The cost of holding a money balance
m is denoted by I(m). When m > 0, the cost
is the forgone interest on bonds relative to
money; when m < 0, the cost is a shortage
cost that is the excess interest paid on overdrafts
relative to bonds, or any other penalty
paid on negative account balances. Assume
that both the holding and penalty costs are
linear to get

(1)  \[ I(m) = \begin{cases} rm & \text{for } m \ge 0 \\ -pm & \text{for } m < 0 \end{cases} \]

where r > 0 is the difference between the
interest rate on bonds and that on positive
money balances and p > 0 is the cost per
dollar of holding negative money balances.

(ii) The cost of transfer of u dollars from
bonds to money is denoted by C(u). These
costs might include two terms: a fixed cost
K per transfer, which is independent of the
transaction size, and a proportional brokerage
fee c per dollar. This gives

(2)  \[ C(u) = \begin{cases} K + cu & \text{for } u > 0 \\ 0 & \text{for } u = 0 \end{cases} \]

with K, c > 0.⁶
Consider now the straightforward extension
of the Baumol-Tobin analysis of money
demand by allowing overdrafting at the
penalty rate p, as in equation (1). Assuming
a constant rate of expenditures, the money
stock bounces between a lower level μ and
an upper level M with a sawtooth shape. At
time t = 0, when the account balance is μ,
an amount of g dollars is paid in bonds to
be spent during the period (which is of
length 1). At t = 0, the consumer makes the
first of n bond sales, each of size (M − μ).
With no proportional cost [c = 0 in equation
(2)] and no discounting, the transactions
cost is


\[ C = nK. \]

Assuming μ < 0, the holding cost is

\[ I = \frac{rM^{2} + p\mu^{2}}{2(M - \mu)}. \]

Minimization of the total cost (C + I)
with respect to M, μ, and n subject to the
constraint

\[ n(M - \mu) = g \]

yields the following solution:

(3)  \[ M = \left[ \frac{2gKp}{r(r + p)} \right]^{1/2} \]

(4)  \[ \mu = -\left[ \frac{2gKr}{p(r + p)} \right]^{1/2}. \]

⁶Implicit in equation (2) is the assumption that
transfers from money to bonds (u < 0) are prohibited
because of the very large cost of transfer in this direction.
The assumption is made for computational reasons
by allowing for only one trigger point, from bonds
to money, and excludes the upper point that might
trigger a transfer from money to bonds. The implications
of this assumption are less important when the
downward drift of the money stock is large relative to
the standard deviation. See Frenkel and Jovanovic
(1980 footnote 3).

Optimizing again, but assuming μ > 0,
gives the internal solution M = μ, which
maximizes total cost. The solution of minimum
cost in this case is the corner solution
μ = 0, which is inferior to the solution given
in equations (3) and (4). I conclude that the
levels M and μ in equations (3) and (4)
minimize the total cost.
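
The minimization behind equations (3) and (4) can be spelled out in a few lines. The following is a sketch rather than the author's own derivation: it assumes μ ≤ 0 < M, introduces the shorthand D = M − μ for the size of each bond sale (D is notation added here), and uses the fact that over each sawtooth cycle the balance is positive for a fraction M/D of the time with mean M/2 and negative for a fraction −μ/D of the time with mean μ/2.

```latex
\begin{align*}
C + I &= \frac{gK}{D} + \frac{rM^{2} + p\mu^{2}}{2D},
        \qquad D \equiv M - \mu, \quad n = \frac{g}{D},\\
\text{for given } D:\quad
rM + p\mu &= 0
        \;\Longrightarrow\;
        M = \frac{p}{r+p}\,D, \qquad \mu = -\frac{r}{r+p}\,D,\\
C + I &= \frac{gK}{D} + \frac{rp}{2(r+p)}\,D
        \;\Longrightarrow\;
        D = \left[\frac{2gK(r+p)}{rp}\right]^{1/2},\\
M &= \left[\frac{2gKp}{r(r+p)}\right]^{1/2},
        \qquad
        \mu = -\left[\frac{2gKr}{p(r+p)}\right]^{1/2}.
\end{align*}
```

The second line shows that the trigger and target always satisfy μ/M = −r/p, so the relative use of credit depends only on the ratio of the two interest costs.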

What distinguishes the solution (3)-(4)
from the Baumol-Tobin case is the use of
overdrafts (μ < 0) in the model, for any
finite value of interest rates r and p. Only
when the penalty rate p of using the credit
is infinitely high (relative to the interest rate
r) do equations (3) and (4) reduce to the
well-known Baumol-Tobin result of μ = 0
and M = (2gK/r)^{1/2}. Unless it is prohibitively
expensive, individuals will utilize
their available credit.
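
As a numerical check on the deterministic model, the short Python sketch below evaluates the closed-form trigger and target and compares their cost with nearby alternatives. It is illustrative only: the function names are hypothetical, the holding-cost expression is the sawtooth average assumed for this section, and the parameter values (g = 1, K = 0.01, r = 2 percent, p = 20 percent) are borrowed from the numerical section of the paper rather than from Section I.

```python
import math


def total_cost(M, mu, g, K, r, p):
    """Transactions plus holding cost per period for target M > 0, trigger mu <= 0."""
    span = M - mu                       # size of each bond sale
    transfers = g / span                # n, from the constraint n(M - mu) = g
    holding = (r * M ** 2 + p * mu ** 2) / (2.0 * span)  # assumed sawtooth average
    return transfers * K + holding


def closed_form(g, K, r, p):
    """Target and trigger from equations (3) and (4)."""
    M = math.sqrt(2.0 * g * K * p / (r * (r + p)))
    mu = -math.sqrt(2.0 * g * K * r / (p * (r + p)))
    return M, mu


# Illustrative parameter values (an assumption, not taken from Section I).
g, K, r, p = 1.0, 0.01, 0.02, 0.20
M, mu = closed_form(g, K, r, p)
best = total_cost(M, mu, g, K, r, p)
print(f"M = {M:.4f}, mu = {mu:.4f}, cost = {best:.5f}")
```

With a very large penalty rate the same functions return a trigger close to zero and a target close to the Baumol-Tobin value (2gK/r)^{1/2}.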

The intuition of this result, which holds 
also in the general case of stochastic dis- 
bursements with discounting, is the follow- 
ing. As long as the money balance is posi- 
tive, the optimal policy is not to increase it 
but rather to let it fall at the rate of expen- 
ditures. This delaying of action reduces both 
transaction costs (since transactions are less 
frequent) and holding costs, since the aver- 
age amount of money held is lower. Simi- 
larly, when the money stock is zero, it pays 
to wait, at least a little while, before con- 
verting bonds to money. In addition to sav- 
ing on transactions cost, the penalty cost, 
proportional to the amount of credit used, 
is small for low levels of overdrafts. 

The robust conclusion of the transactions
theory of the demand for money is that
when credit is available to firms or individuals,
even at a cost, they will frequently utilize
it.⁷ It is interesting to compare this
conclusion with that of Lucas and Stokey 
(1983). Their model, like mine, attempts to 
explain the use of both money and credit. In 
both models, money and credit are used to 
facilitate transactions, not as wealth. How- 
ever, in Lucas and Stokey’s model, money 
can purchase any good, while credit can be 
used to purchase only some goods (“credit 
goods”), but not others (“cash goods”). 

As a result, the two means of payment 
cannot be perfect substitutes in Lucas and 
Stokey’s framework. In fact, the degree of 
substitutability between money and credit 
depends on two exogenously given factors: 
the size of the subset of credit goods and 
the substitutability in consumption of credit 
and cash goods (which is a property of indi- 
viduals’ preferences for these goods). Hence, 
the relative use of credit and money depends
not only on the relevant interest rates
(r and p in my notation), but also on two
exogenously given factors. No matter how
large the cost r of holding money is, the
consumer cannot substitute credit for cash
if he wishes to consume cash goods.

Here, on the other hand, money and 
credit are inherently very close substitutes, 
since both are perfectly acceptable means of 
payment (in the Lucas and Stokey terminology,
all goods are “credit goods”). The utilization
of credit and money depends only
on the relative costs. An increase in the cost
of holding money relative to credit (higher
r/p ratio) results in substitution of credit
for money, as given by equations (3) and (4).
This sensitivity with respect to interest rates 
is preserved also in the general case, since 
the two means of payment provide a similar 
amount of liquidity. 

⁷The case for using credit is even stronger with
discounting. In this case it is more profitable to postpone
the payment of the finite cost K, since the higher
penalty cost is paid later. This constitutes an additional
reason to reduce μ.

Another issue that will be discussed later
and can be demonstrated by using the simple
model of this section is the appropriate
definition of the money stock. There are at
least three possible definitions:

(5)  \[ E(m) = \frac{1}{2}(M + \mu) \]

(6)  \[ E_1(m) = \frac{M^{2}}{2(M - \mu)} \]

(7)  \[ E_2(m) = \frac{1}{2}(M - \mu). \]

E(m) is the average account balance,
which is positive only when r < p. E₁(m) is
the standard definition of money, the sum
of positive balances in checking accounts,
which assigns zero weight to overdrafts. Observing
the liquidity of credit, whether used
or not, E₂(m) measures the amount of
available means of payment at each point in
time up to the level μ. The three definitions
satisfy the inequality E₂(m) ≥ E₁(m) ≥ E(m),
whereas the equality holds in the
Baumol-Tobin case. The claim that I make
in Section V is that the more appropriate
measure of the medium of exchange is given
by the broader aggregate E₂(m).
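
The ranking of the three aggregates can be checked numerically. The sketch below is an illustration under stated assumptions: the balance is taken to fall linearly from M to μ over each cycle, the expression used for the positive-balance aggregate is the time average of max(m, 0) over that cycle (an assumed reading of the standard definition), and the function names are hypothetical.

```python
def aggregates(M, mu):
    """Three candidate money measures for a sawtooth between M > 0 and mu <= 0."""
    E = 0.5 * (M + mu)                  # average account balance
    E1 = M ** 2 / (2.0 * (M - mu))      # positive balances only (assumed reading)
    E2 = 0.5 * (M - mu)                 # balance plus unused credit down to mu
    return E, E1, E2


def sawtooth_average(M, mu, f, steps=200_000):
    """Midpoint-rule time average of f(m) as m falls linearly from M to mu."""
    width = (M - mu) / steps
    return sum(f(M - (k + 0.5) * width) for k in range(steps)) / steps


# Illustrative trigger and target (roughly the Section I optimum for
# g = 1, K = 0.01, r = 0.02, p = 0.20; an assumption for this example).
M, mu = 0.9535, -0.0953
E, E1, E2 = aggregates(M, mu)
print(f"E = {E:.4f}, E1 = {E1:.4f}, E2 = {E2:.4f}")
```

Setting μ = 0 collapses all three measures to M/2, which is the Baumol-Tobin case in which the distinction between them disappears.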


II. Formulation of the Stochastic Problem


At each point in time t, when the money
stock is m(t), the agent decides whether to
convert bonds to money. Suppose he decides
on such a transfer of size u_i dollars at
time t_i. This transfer costs C(u_i), defined in
equation (2), and is carried out promptly to
yield the money stock m(t_i⁺) = m(t_i) + u_i.

The money stock on hand is also changed
by the random net expenditure flow, according
to the following stochastic differential
equation:

(8)  \[ dm(t) = -g\,dt + \sigma\,dw(t) + \sum_{i \ge 1} u_i\,\delta(t - t_i). \]

Positive values of mean disbursements g
denote net cash outflow. The stochastic part
of the expenditures is described by the
Wiener process w(t) with mean zero and
variance t.⁸ The last term in (8) denotes the
discrete⁹ increases of size u_i of the money
stock made at times t_i, where δ(t) is Dirac’s
delta function.¹⁰

The optimizing agent chooses a sequence
of financial transfers u_i made at t_i in order
to minimize the expected discounted cost
over an infinite horizon:

(9)  \[ V(m) = \min_{\{u_i, t_i\}} E_0 \left[ \int_0^{\infty} I(m(t))\, e^{-\alpha t}\,dt + \sum_{i \ge 1} (K + c u_i)\, e^{-\alpha t_i} \right] \]

subject to the stochastic process described
in equation (8). E₀ denotes the expectations
operator given information known at time
zero, and α is the interest rate on bonds.
The cost of holding (positive or negative)
money stock m(t) is accumulated continuously
at a rate I(m(t)) given in equation (1).
The transfer cost K + cu is incurred discretely.


⁸Frenkel and Jovanovic (1980) identify g as representing
the transactions motive for holding money,
while σ² stands for the precautionary motive. Miller
and Orr (1966), on the other hand, interpret σ² loosely
as a transactions term (p. 425). I think the latter
interpretation is more appropriate, because there is no
room for precautionary motives in this framework.
Since the analysis is done in continuous time and the
disbursement flow is finite, there is zero probability
that money holding will overshoot the thresholds. In
this case, even when σ² is large, the agent can control
his money holding by choosing the right thresholds
without worrying about holding money as a precaution
against an unexpectedly low stock. In order to study
the precautionary motive, the analysis should be done
in discrete time, when there might be a finite probability
of overshooting the trigger levels.

⁹Financial transactions will be made infrequently
because of the fixed cost K > 0 that accompanies any
transaction. In this case, a continuous transfer during
any finite period of time results in an infinitely high
cost.

¹⁰The delta function is defined by

\[ \int_a^b f(x)\,\delta(x - c)\,dx = \begin{cases} f(c) & \text{if } a < c < b \\ 0 & \text{otherwise} \end{cases} \]

for any continuous function f(x).

A common assumption made in studies of
money demand is that the optimal transfer
policy {u_i, t_i} takes the form of simple trigger-target
rules. The existence of such a
rule for the problem (8)-(9) was established
by Constantinides and Richard (1978). They
proved that the optimal policy is of the
(S, s) type studied in the inventories literature:
when the money stock is below the
trigger point s, a sale of bonds will be made
such that the quantity of money will increase
to the target level S; otherwise no
financial transaction will be made. I can
now proceed to the evaluation of these trigger
and target levels, denoted by μ and M,
respectively, from now on.
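
To make the trigger-target rule concrete, the following Python sketch simulates the money stock under a discretized version of equation (8) and applies the (S, s) rule at each step. It is a sketch under stated assumptions, not the paper's procedure: the Euler time step `dt`, the horizon, the seed, and the function name are all illustrative choices, and the trigger and target are supplied by the user rather than optimized.

```python
import random


def simulate_trigger_target(M, mu, g, sigma, horizon=50.0, dt=0.001, seed=0):
    """Simulate dm = -g dt + sigma dw with a reset to M whenever m <= mu.

    Returns the list of transfer sizes and the lowest post-adjustment balance.
    """
    rng = random.Random(seed)           # seeded so the run is reproducible
    m = M
    transfers = []
    lowest = m
    for _ in range(int(round(horizon / dt))):
        m += -g * dt + sigma * rng.gauss(0.0, dt ** 0.5)
        if m <= mu:                     # trigger reached (or overshot in discrete time)
            transfers.append(M - m)     # bonds sold to restore the target
            m = M
        lowest = min(lowest, m)
    return transfers, lowest


# Illustrative values: target and trigger close to those reported for p = 50 percent.
transfers, lowest = simulate_trigger_target(M=0.865, mu=-0.219, g=1.0, sigma=1.0)
print(len(transfers), "transfers; smallest transfer:", round(min(transfers), 4))
```

Because the simulation runs in discrete time, the balance can overshoot the trigger before it is reset, so every transfer is at least M − μ; in the continuous-time model the transfer is exactly M − μ.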


III. Solution

The optimization problem described in
the former section generalizes previous work
on money demand in several ways. The most
important is the consideration of overdrafts.
This option has been implicitly excluded in
other work; instead, the trigger level μ was
assumed to have a certain value (usually
zero), and the solving procedure has been
to find the target M, given the exogenous
value of μ.

¹¹Constantinides and Richard (1978) have used this
theory to prove the optimality of (S, s) as the money
rule for the problem (8)-(9).

Allowing overdrafting at a finite penalty
rate makes the trigger a control variable
that is chosen optimally. Hence the solution
to the optimization problem requires finding
both μ and M. This can be accomplished
by using an optimal control theory,
the “impulse control,” which analyzes optimal
behavior in continuous time with fixed
cost of taking an action.¹¹ This type of control
characterizes the behavior of the cash
manager in the presence of a fixed transactions
cost. Sulem (1986) used this apparatus
to solve the optimization problem (8)-(9) as
follows. At each point in time t, when the
money stock is at a level m, the cash manager
can either sell bonds immediately to
increase his money stock or postpone his
transaction at least to time t + τ. In the
latter case

(10)  \[ V(m) \le E\left[ \int_t^{t+\tau} I(m(x))\, e^{-\alpha(x - t)}\,dx + V(m(t + \tau))\, e^{-\alpha\tau} \right]. \]


The first term on the right-hand side of
equation (10) is the cost of money holding
between t and t + τ. From Bellman’s principle
of optimality, the second term in (10) is
the minimum expected cost from time t + τ
on. Expanding equation (10) as a Taylor
series around time t, where m(t + τ) =
m − gτ + σ dw(t) from equation (8), and letting
τ → 0 yield the following differential
equation:¹²

(11)  \[ -\frac{1}{2}\sigma^{2}\frac{d^{2}V}{dm^{2}} + g\frac{dV}{dm} + \alpha V \le I. \]

When bonds are sold at time t, the money
stock increases immediately to level m + u.
Since the transaction size u is chosen optimally,
then,

(12)  \[ V(m) \le K + \min_{u \ge 0}\,\bigl(cu + V(m + u)\bigr). \]

Since either equation (11) or (12) must hold
as an equality, V(m) is the solution of the
following set of equations:

(13)  \[ AV \le I \]
(14)  \[ V \le BV \]
(15)  \[ (AV - I)(V - BV) = 0 \]

where

\[ AV(m) = -\frac{1}{2}\sigma^{2}\frac{d^{2}V}{dm^{2}} + g\frac{dV}{dm} + \alpha V \]

\[ BV(m) = K + \min_{u \ge 0}\,\bigl(cu + V(m + u)\bigr). \]


¹²The properties of mean zero and variance t of the
Wiener process w(t) are also used in the derivation of
equation (11). The function I is defined in equation
(1). An equation similar to (11) is derived in Dixit
(1989).

The system in (13)-(15), called a quasi-variational
ational inequality in the impulse control lit- 
erature, allows solving for the expected cost 
V as a function of the money stock m and 
for the trigger and target levels. The solu- 
tion is stated in the following theorem. 


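THEOREM: The optimal levels of the target
M and the trigger μ are given by the following
equations:

(16)  \[ M = (\alpha c + r)^{-1}\left[ (\alpha c - p)\mu + \frac{p + r}{\lambda_2}\left(e^{\lambda_2\mu} - 1\right) - \alpha K \right] \]

(17)  \[ e^{-\lambda_1 M} = (\alpha c + r)^{-1}\left[ (\alpha c - p)e^{-\lambda_1\mu} + \frac{p + r}{\lambda_2 - \lambda_1}\left(\lambda_2 - \lambda_1 e^{(\lambda_2 - \lambda_1)\mu}\right) \right] \]

where the parameters λ₁ and λ₂ are defined
as

(18)  \[ \lambda_1 = \sigma^{-2}\left[ -(g^{2} + 2\alpha\sigma^{2})^{1/2} + g \right] < 0 \]

(19)  \[ \lambda_2 = \sigma^{-2}\left[ (g^{2} + 2\alpha\sigma^{2})^{1/2} + g \right] > 0. \]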


PROOF: 
See Appendix 1. 
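
The two equations of the theorem have no closed-form solution, but they are easy to solve numerically. The Python sketch below treats each equation as giving the target M implied by a trial trigger μ and finds the μ at which the two agree. It is a sketch under stated assumptions: the bisection scheme and the function names are choices made here (not the method used in the paper), the two `target_*` functions encode equations (16) and (17) as read here, and the default parameters (g = σ = 1, K = 0.01, c = 0) follow the numerical example reported in Section IV.

```python
import math


def roots(g, sigma, alpha):
    """lambda_1 < 0 < lambda_2 from equations (18) and (19)."""
    s = math.sqrt(g * g + 2.0 * alpha * sigma * sigma)
    return (g - s) / sigma ** 2, (g + s) / sigma ** 2


def target_16(mu, p, r, alpha, c, K, lam2):
    """Target M implied by equation (16) for a trial trigger mu."""
    bracket = ((alpha * c - p) * mu
               + (p + r) / lam2 * (math.exp(lam2 * mu) - 1.0)
               - alpha * K)
    return bracket / (alpha * c + r)


def target_17(mu, p, r, alpha, c, lam1, lam2):
    """Target M implied by equation (17) for a trial trigger mu."""
    bracket = ((alpha * c - p) * math.exp(-lam1 * mu)
               + (p + r) / (lam2 - lam1)
               * (lam2 - lam1 * math.exp((lam2 - lam1) * mu)))
    return -math.log(bracket / (alpha * c + r)) / lam1


def solve_trigger_target(p, r, alpha, g=1.0, sigma=1.0, K=0.01, c=0.0):
    """Return (M, mu) solving (16)-(17) by bisection on mu < 0."""
    lam1, lam2 = roots(g, sigma, alpha)

    def gap(mu):
        return (target_16(mu, p, r, alpha, c, K, lam2)
                - target_17(mu, p, r, alpha, c, lam1, lam2))

    hi, lo = 0.0, -0.5                  # gap(0) < 0; widen lo until gap(lo) > 0
    while gap(lo) <= 0.0:
        lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return target_16(mu, p, r, alpha, c, K, lam2), mu


M, mu = solve_trigger_target(p=0.50, r=0.02, alpha=0.07)
print(f"p = 50%: M = {M:.3f}, mu = {mu:.3f}")
```

For a 50 percent overdraft rate the sketch can be compared with the values M = 0.865 and μ = −0.219 that the paper reports for its Figure 1 parameters.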


IV. Analysis of the Solution 
A. Discussion 


The first property of equations (16) and
(17) worth mentioning is that the solution
for M and μ is homogeneous of degree one
in the vector (g, σ, K). Hence money demand
is demand for real balances.

The framework for analyzing money demand
described in the preceding two sections
generalizes previous research in three
respects.

(i) Overdrafting is allowed at some finite
penalty rate, p.

(ii) The cost of holding money, r, is not
necessarily equal to the discount rate, α.

(iii) The proportional cost of transferring
bonds to money, c, is not restricted to zero.

One of the important properties of the
solution (16)-(17), as demonstrated in Appendix 1,
is μ < 0. The prevailing assumption
that constrains the trigger level to zero
is therefore literally correct only when the
cost of credit is infinitely high.¹³ Thus, μ < 0
is an unambiguous prediction of the transactions
theory of the demand for money, a
prediction that holds in the general stochastic
case and not only in the simple model of
Section I; economic agents do use their
credit provision when it is not excessively
expensive.¹⁴

The assumption of the equality between r
and α, which is also made very often in the
literature, is a very limiting one. If the cost
of holding money, r, is assumed to equal the
bond market rate, α, it must be the case
that the interest paid on money is zero. This
might have been a good assumption when
the interest paid on demand deposits was
legally so constrained. In the wake of the
deregulation of the banking industry, the
assumption r = α restricts the analysis to
the demand for currency, which is not what
the transaction theory of money demand is
about. Since a large fraction of the monetary
aggregates bear positive interest rates,
allowing for r < α is crucial for analyzing
the demand for money.

In general, there is a dichotomy in the
literature between models with fixed cost K
only, as in the Baumol-Tobin or Miller and
Orr (1966) models, and analyses of proportional
cost with no fixed cost, as in Eppen
and Fama (1969). However, the consequences
of the inclusion of both sources of
costs in the model are not as crucial as the
other two generalizations mentioned previously.
When the two kinds of costs are
present, the important one is the fixed cost;
then the optimal money holding rule is the
trigger-target rule that is typical of models
with lumpy transactions cost.

¹³Numerical solutions for μ and M, some of them
shown below (for example, Figs. 1 and 2), show that for
a wide range of parameters the convergence of μ to
zero when p increases is fairly slow.

¹⁴Empirical support of a widespread use of overdrafts
and trade credit appears in Kanniainen (1978),
Laffer (1970), and others.

FIGURE 2. PLOT OF TRIGGER AND TARGET POINTS VS. HOLDING RATE


B. Very High Overdraft Rate 


The most general analysis of money
demand when net disbursements include
both deterministic and stochastic elements
is that of Frenkel and Jovanovic (1980).
Their model is a special case of mine with
no proportional cost and with one interest
rate instead of three; that is, c = 0, p → ∞,
and r = α. Assuming c = 0 and p → ∞, I can
concentrate on the importance of the generalization
r ≠ α. In this case, equations (16)
and (17) yield the following approximate
solutions for M and μ:¹⁵

(20)  \[ M = \mu - \frac{K\alpha}{r} + \frac{p}{2r}\lambda_2\mu^{2} \]

(21)  \[ e^{-\lambda_1 M} = 1 - \lambda_1\mu - \frac{p}{2r}\lambda_1\lambda_2\mu^{2}. \]

Equations (20) and (21) give the following
solution for M:

(22)  \[ e^{-\lambda_1 M} = 1 - \lambda_1 M - \frac{K\alpha\lambda_1}{r}. \]


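Following Frenkel and Jovanovic, I expand
equation (22) in Maclaurin series and
ignore terms of third and higher order. The
result is¹⁶

(23)  \[ M = \left[ -\frac{2K\alpha}{\lambda_1 r} \right]^{1/2}. \]

Substituting for λ₁, then,

(24)  \[ M = \left[ \frac{2K\alpha\sigma^{2}}{r\left[ (g^{2} + 2\alpha\sigma^{2})^{1/2} - g \right]} \right]^{1/2} \]

and the approximate solution for μ is

(25)  \[ \mu \approx -\left[ \frac{2r}{p\lambda_2}\left( M + \frac{K\alpha}{r} \right) \right]^{1/2} \to 0. \]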
¹⁵Equations (20) and (21) are derived by expanding
equations (16) and (17), respectively, in Maclaurin series
and ignoring terms of third and higher order for
those expressions that multiply p/r and of second and
higher order for those expressions that do not multiply
p/r (which, by assumption, is infinitely high).

¹⁶The approximation used in deriving equation (23)
might be less accurate for finite p than that used in
equations (20) and (21) because μ, but not M, approaches
zero when p/r → ∞.

Assuming that r = α, equation (24) reduces
to the central result, the money demand
equation, of Frenkel and Jovanovic.¹⁷
Equation (24) can serve as a very conve- 
nient instrument in examining the signifi- 
cance of constraining the two interest rates 
to be equal. In fact, one can define at least 
four different elasticities which are of inter- 
est. 

(a) Interest elasticity with r = α. In this
case, I assume that no interest is paid on
money; any change in the market interest
rate results in an identical change in the
cost of holding money. The empirical relevance
of this case is to the demand for
currency, M₀. The interest elasticity in this
case is the interest elasticity analyzed by
Frenkel and Jovanovic, denoted by η(M₀, r),
and is derived from equation (24) to give

(26)  \[ \eta(M_0, r) = -\frac{r\sigma^{2}}{2}\left[ g^{2} + 2r\sigma^{2} - g(g^{2} + 2r\sigma^{2})^{1/2} \right]^{-1} < 0 \]

which satisfies −1/2 < η(M₀, r) < 0.

(b) Elasticity with respect to the market
interest rate, α, given a constant holding
rate, r. In this case, the interest paid on
interest-bearing money, such as NOW accounts,
varies with the market interest rate
point by point. For instance, when α = 7
percent, the NOW rate is 5 percent, and
when α increases to 7.5 percent, the NOW
rate increases to 5.5 percent so that r = 2
percent without change. In this case

(27)  \[ \eta^{(b)}(M, \alpha) = \frac{1}{2} + \eta(M_0, r) \]

which is nonnegative. Hence, when the cost
of holding money is constant, an increase in
the market interest rate leads to an increase
in money demand. The intuition behind this
apparently surprising result is simple: an
increase in the nominal interest rate with no
change in the nominal rate on money holding
is in fact a decrease in the effective cost
of holding money.¹⁸

¹⁷Frenkel and Jovanovic call equation (24) with r = α
“the optimal money holding.” Although M is, in general,
different from average money holding, the justification
for this statement in their model is that M is the
only control variable.

(c) Elasticity with respect to the interest
cost of money, r, given a constant market
rate, α. In this case, the interest paid on the
interest-bearing money changes without a
change in the market rate. For example, a
fall in the interest paid on NOW accounts,
with no other change, results in a higher r
with the same α. The interest elasticity is
now

(28)  \[ \eta^{(c)}(M, r) = -\frac{1}{2} \]

which is the well-known prediction of the
square-root rule of the Baumol-Tobin
model. This is an interesting result. A well-established
fact in the theory of money demand
is that the crucial assumption in the
Baumol-Tobin model is that of deterministic
disbursements; that is, g/σ → ∞. What
I find here is that their result is robust to
a stochastic generalization as long as the
rate α does not change. The reason for this
is the following. What characterizes the
Baumol-Tobin analysis is not only the deterministic
nature of their model but also the
“steady-state” assumption, which in fact
means no discounting. Hence, I conclude
that the assumption of constant discount
rate is pivotal to the Baumol-Tobin model,
a conclusion that could not be derived without
the distinction between r and α.

(d) Elasticity with respect to the market
interest rate, α, given a constant rate on the
interest-bearing money, i. This case seems
to be empirically plausible for short-run
analysis since the interest paid on checking
accounts, i, is much less volatile than competitive
market rates.¹⁹ Substituting r = α − i,
where i is fixed, in equation (24) I get

(29)  \[ \eta^{(d)}(M, \alpha) = -\frac{i}{2(\alpha - i)} + \eta(M_0, r) \]

which reduces to case (a) when i = 0. Notice,
however, that the interest elasticity can
now be much larger than one-half in absolute
value if i is close to α. For example,
when i = 5.25 percent and α = 7 percent,
then η^{(d)}(M, α) = −1.5 + η(M₀, r), which
falls in the (−2, −1.5) region. The reason
for this high short-run interest elasticity is
that a percentage rise in the market rate is
translated into a much larger increase in the
cost of holding money.

¹⁸The possibility of positive interest elasticities arises
also in Romer’s (1986) model.

¹⁹The rigidity of i can arise from institutional rigidities
and from the “menu costs” of administering an
interest rate change.

The analysis of interest rate elasticities 
can be summarized as follows. By retaining 
the assumptions of no proportional cost (c = 0)
and a prohibitively expensive overdraft
rate (p → ∞), one can study the effect of
relaxing the assumption of zero interest rate 
paid on money. The implications of this 
generalization seem to be fairly important. 
One can now distinguish between different 
monetary aggregates in their response to 
interest rate changes. The interest elasticity 
of currency demand [case (a)], which is the 
case that is usually investigated in the litera- 
ture, is negative and larger than —1/2. 
However, the analysis of interest elasticity 
of interest-bearing assets is much richer and 
depends on the way in which interest rates 
vary. In the short run, when the interest 
paid on checking accounts is fixed, one ex- 
pects to see a large drop in the demand for 
these accounts when the market rate in- 
creases [case (d)]. However, in the long run, 
when the interest paid on NOW and similar 
accounts adjusts to the new, higher interest 
rates, the drop in the money demand will be 
milder, and one might even see a rise in 
demand [case (b)]. Case (c) studies circum- 
stances in which the interest paid on check- 
ing accounts is more volatile than market 
rates. This might have been the case when
the legal restrictions on payment of interest
on checking accounts were removed in recent
years to produce an abrupt drop in the
cost of money holding, not fully accompanied
by a similar drop in other interest
rates. In this case, my model predicts a
surprising interest rate elasticity of —1/2, 
the Baumol-Tobin result, even in a stochas- 
tic framework. 
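
The four elasticities can be checked by differentiating the approximate money demand equation (24) numerically. The Python sketch below is illustrative only: the function names are hypothetical, the log-difference step `h` is an arbitrary choice, the defaults g = σ = 1 and K = 0.01 are taken from the paper's numerical example, and `eta0` encodes the case (a) elasticity as read from equation (26).

```python
import math


def target_24(alpha, r, g=1.0, sigma=1.0, K=0.01):
    """Approximate target M from equation (24), valid for c = 0 and p -> infinity."""
    s = math.sqrt(g * g + 2.0 * alpha * sigma ** 2)
    return math.sqrt(2.0 * K * alpha * sigma ** 2 / (r * (s - g)))


def eta0(rate, g=1.0, sigma=1.0):
    """Case (a) elasticity: the holding cost equals the market rate."""
    s = math.sqrt(g * g + 2.0 * rate * sigma ** 2)
    return -rate * sigma ** 2 / (2.0 * s * (s - g))


def log_slope(f, x, h=1e-6):
    """Central-difference estimate of d ln f / d ln x at x."""
    up, down = x * (1.0 + h), x * (1.0 - h)
    return (math.log(f(up)) - math.log(f(down))) / (math.log(up) - math.log(down))


alpha, r, i = 0.07, 0.02, 0.0525       # example rates quoted in the text
case_a = log_slope(lambda x: target_24(x, x), alpha)       # r moves one-for-one with alpha
case_b = log_slope(lambda x: target_24(x, r), alpha)       # r held constant
case_c = log_slope(lambda x: target_24(alpha, x), r)       # alpha held constant
case_d = log_slope(lambda x: target_24(x, x - i), alpha)   # rate paid on money held constant
print(round(case_a, 3), round(case_b, 3), round(case_c, 3), round(case_d, 3))
```

Case (d) uses r = α − i = 1.75 percent, which is why a one percent rise in the market rate translates into a four percent rise in the cost of holding money.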


C. Deterministic Disbursements 


When net disbursements are deterministic
(σ² = 0, g > 0), the quasi-variational
inequality (13)-(15) leads to a first-order
instead of a second-order differential equation
for V(m), m ≥ μ. The straightforward
solution for M and μ, assuming c = 0 for
simplicity, is given by the following equations:²⁰

(30)  \[ rM + p\mu = -\alpha K \]

(31)  \[ r e^{\alpha M/g} + p e^{\alpha\mu/g} = r + p. \]

When the overdraft rate p satisfies p → ∞,
the solution for M and μ becomes the
Baumol-Tobin result:²¹

(32)  \[ M = \left( \frac{2gK}{r} \right)^{1/2} \]

(33)  \[ \mu = -\frac{(2gKr)^{1/2}}{p} - \frac{\alpha K}{p}. \]

Assume now that disbursements are deterministic
and agents minimize expected
cost per unit of time. This assumption of
zero discount rate (α = 0), the so-called
“steady-state approach,” results in equations
(3) and (4), Section I, as the solution
for M and μ.²² Once again, the Baumol-Tobin
result is obtained when p → ∞.
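
The deterministic pair (30)-(31) can also be solved directly and compared with the large-p approximation. The Python sketch below is an illustration under stated assumptions: it takes c = 0, substitutes (30) into (31), and finds the target by bisection, which is a choice made here rather than the paper's method; the function names and the test value p = 100 are hypothetical.

```python
import math


def deterministic_solution(p, r, alpha, g=1.0, K=0.01):
    """Solve equations (30) and (31) for (M, mu), assuming c = 0."""
    def mu_of(M):
        return -(alpha * K + r * M) / p              # equation (30)

    def residual(M):
        # Equation (31) rearranged as r(e^{aM/g} - 1) + p(e^{a mu/g} - 1) = 0.
        return (r * math.expm1(alpha * M / g)
                + p * math.expm1(alpha * mu_of(M) / g))

    lo, hi = 0.0, 1.0                                # residual(0) < 0
    while residual(hi) <= 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    M = 0.5 * (lo + hi)
    return M, mu_of(M)


def large_p_approximation(p, r, alpha, g=1.0, K=0.01):
    """Equations (32) and (33)."""
    M = math.sqrt(2.0 * g * K / r)
    mu = -math.sqrt(2.0 * g * K * r) / p - alpha * K / p
    return M, mu


M, mu = deterministic_solution(p=100.0, r=0.02, alpha=0.07)
M_bt, mu_bt = large_p_approximation(p=100.0, r=0.02, alpha=0.07)
print(f"exact: M = {M:.4f}, mu = {mu:.6f}; approximation: M = {M_bt:.4f}, mu = {mu_bt:.6f}")
```

The small remaining gap between the two targets reflects the third-order terms dropped in the Maclaurin expansion, not the finite value of p.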


²⁰See Sulem (1986). Notice that when disbursements
are deterministic, the solution can be derived by using
simple calculus, and it is not necessary to resort to
impulse control methods.

²¹Ignoring terms of third and higher order in the
Maclaurin series of e^{αM/g} and second and higher
order for e^{αμ/g}, since M is much larger than (the
absolute value of) μ in this case.

²²Notice that equations (3) and (4) cannot be derived
directly from equations (30) and (31) by substituting
α = 0. This is because the latter case should be
solved differently from the α > 0 case, as demonstrated
by Bather (1966) and Sulem (1986).

D. Numerical Solutions


Comparative static analysis of the general solution for M and $\mu$,
given in equations (16) and (17), is straightforward but tedious.
Most of the elasticities have, in general, ambiguous signs.²³ It is
perhaps more illuminating to present numerical solutions for M and
$\mu$ as a function of different parameters.²⁴ I concentrate on the
effects of changes in the parameters p, r, and $\alpha$, the most
important variables that distinguish this analysis from previous
ones. Figure 1 depicts the change of the target level M and the
trigger $\mu$ versus the overdraft rate p for discount rate
$\alpha$ = 7 percent, cost of holding money r = 2 percent
(corresponding to 5 percent interest paid on checking accounts),
c = 0, g = $\sigma$ = 1, and K = 0.01. Both M and $\mu$ increase with
p at similar rates, so that the difference (M - $\mu$) is not very
sensitive to p when p is not very small.²⁵ For example, when p
increases from 20 percent to 40 percent, M - $\mu$ decreases from
1.141 to 1.094, which gives an arc elasticity of -0.04. I shall make
use of this observation later. Notice also that the convergence
toward the limiting values of M [equation (24)] and $\mu$ (zero) for
$p \to \infty$, the values that are closely related to the
Frenkel-Jovanovic analysis, is fairly slow. When p = 50 percent,
then M = 0.865 and $\mu$ = -0.219; even when p rises to the
outrageous rate of 200 percent, M = 0.936 and $\mu$ = -0.105,
compared to the values M = 1.017 and $\mu$ = 0, which correspond to the


²³An exception is the effect of an increase in the
fixed cost on the target M. When c = 0, then,


dM fa eae 
roa es eer 


and the corresponding elasticity varies with different 
parameters and is not fixed at the level 1/2. 

²⁴The applicability of this method is not as narrow
as it sounds because of the homogeneity of M and $\mu$
as a function of (g, $\sigma$, K). One can thus interpret the
numbers for M, $\mu$, g, $\sigma$, and K as representing, say,
thousands of dollars.

²⁵For very small values of p, the trigger $\mu$ increases
very rapidly. This is because $\mu \to -\infty$ when $p \to 0$,
since it is optimal never to sell bonds in this case.

$p \to \infty$ approximation. Hence, the generalization to finite
values of the overdraft rate p does make a difference even if this
rate is high.

Figure 2 presents the effect of the cost of holding money r on the
trigger and target levels when the parameters are p = 20 percent,
$\alpha$ = 7 percent, g = $\sigma$ = 1, K = 0.1, and c = 0. As
expected, both M and $\mu$ fall when r rises. However, the elasticity
of either of these two variables with respect to the holding rate r
is, in general, less than one-half (in absolute value), unlike the
case when the overdraft rate $p \to \infty$. Notice that when
$r \to 0$, then $M \to \infty$ and $\mu \to 0$, and the money balance
is always positive. On the other hand, when r becomes large relative
to p, the target M is negative, resulting in negative money balances
held at all times.²⁶

The target and trigger levels are not very sensitive to changes in
the market interest rate $\alpha$. For a wide range of parameters, M
and $\mu$ are practically constant even when $\alpha$ changes from
1 percent to 17.5 percent. However, it is interesting to note that
the unexpected result of a rise in the money demand when the interest
rate $\alpha$ increases, found in the approximation for large p
[equation (27)], still holds when p is finite: M increases slightly
and $\mu$ decreases slightly when $\alpha$ rises, such that the
difference M - $\mu$ increases, although by a very small amount.

The effects on M and $\mu$ of changing the parameters of the Wiener
process and the cost function are as follows. An increase in the
mean:variance ratio $g/\sigma^2$ raises both M and $\mu$, but
M - $\mu$ can rise or fall. An increase in the fixed cost K, by
contrast, raises the target M and lowers the trigger $\mu$ by large
amounts. Thus, the difference M - $\mu$ is quite sensitive to changes
in the fixed cost. Variations in the proportional cost c have a much
milder effect: large increases in c produce slight decreases in M
and $\mu$. This numerical analysis thus extends existing
comparative-analysis results that are derived by approximate
solutions.²⁷


²⁶In Figure 2, I allow r > $\alpha$, which implies negative
interest on demand deposits.

²⁷See, for example, Hadley and Whitin (1963) or
Blinder (1981).


BAR-ILAN: OVERDRAFTS AND THE DEMAND FOR MONEY 1211 


V. Aggregate Money Demand 


One of the results of my model, $\mu < 0$, implies that individuals
and firms frequently use their available credit. As discussed for
the deterministic case in Section I, this suggests at least three
different measures of the money stock.

(i) Weighted average of the money stock, when both positive and
negative balances are weighted by their relative frequency. This is
probably what individuals or firms perceive as their average money
holding. If one denotes by $\phi(m, t)$ the probability density
function of having a money balance m at time t,²⁸ then average
holding, denoted by $E_1(m)$, will be


(34)    $E_1(m) = \int m\,\phi(m,t)\,dm.$


(ii) Average positive money holding. This 
definition is the closest to the current defi- 
nition of money and is given by 


(35)    $E_2(m) = \int_0^{\infty} m\,\phi(m,t)\,dm.$


(iii) Average money stock measured by its distance from the trigger
level $\mu$. This redefinition of the “zero level” of the money stock
yields


(36)    $E_3(m) = \int (m - \mu)\,\phi(m,t)\,dm = E_1(m) - \mu.$


As long as $\mu \le 0$, as will always be the case in my framework,
the relationship among the three measures will be


(37)    $E_3(m) \ge E_2(m) \ge E_1(m)$


where the equalities hold for $\mu = 0$ ($p \to \infty$),
the standard case in the literature.


²⁸$\phi(m,t)$ is a function of the initial money stock. I
omit this as an explicit parameter for simplicity of
exposition. Similarly, the time subscript is omitted in
$E_i(m)$.

The steady-state distribution of the money stock is defined as²⁹


(38)    $\phi(m) = \lim_{t \to \infty} \phi(m,t).$


The derivation of $\phi(m)$ for the one-target-one-threshold
($\mu$, M) policy is given in Appendix 2.³⁰ The result is


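(39)    $\phi(m) = \begin{cases} (M-\mu)^{-1}\left[1 - e^{-\tau(m-\mu)}\right] & \text{for } \mu \le m \le M \\ (M-\mu)^{-1}\left[e^{-\tau(m-M)} - e^{-\tau(m-\mu)}\right] & \text{for } M < m \end{cases}$

where $\tau = 2g/\sigma^2$.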
Using (39) one gets 


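(40)    $E_1(m) = \dfrac{1}{2}\left[M + \mu + \dfrac{\sigma^2}{g}\right]$

(41)    $E_2(m) = (M-\mu)^{-1}\left[\dfrac{M^2}{2} + \dfrac{M}{\tau} + \tau^{-2}\left(1 - e^{\mu\tau}\right)\right]$

(42)    $E_3(m) = \dfrac{1}{2}\left[M - \mu + \dfrac{\sigma^2}{g}\right]$


which reduce to the equivalent measures
(5)-(7) for the deterministic case.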


²⁹The steady-state distribution is not a function of
the initial money stock.

³⁰Actually, Appendix 2 presents the derivation for
the more general ($\mu_1$, M, $\mu_2$) rule, where $\mu_2$ is a
second, upper threshold that triggers a reduction in the
money stock to the target M. Only then is the condition
$\mu_2 \to \infty$ applied. Notice also that when $\mu = 0$,
equation (39) reduces to equation (24) in the Frenkel
and Jovanovic (1980) paper.

It is interesting to compare the standard definition of the money
supply, $E_2(m)$, which ignores credit, to the definition $E_3(m)$.
The latter, which defines unutilized credit as money, is probably
the measure proposed by Keynes (1930) and used by Laffer (1970) for
the U.S. economy. There are many aspects to the question of the
appropriate monetary aggregate. For example, it is crucial to
understand the process generating the monetary base, which is
tightly controlled by the Fed, for its effect on the dynamics of the
price level. However, if money is defined as the medium of exchange,
our analysis suggests that $E_3(m)$ is probably more appropriate.

The liquidity of a particular financial instrument is measured by
the pecuniary and nonpecuniary costs of transferring it to cash or
demand deposits. In this respect, credit should be included in the
same monetary aggregate as currency and checking accounts; that is,
$M_1$. Having $500 in a checking account and a credit line of $1,000
yields the same amount of means of payment as $1,500 in one’s
account, with no credit. The transition from a positive to a
negative balance in an account with overdrafting provision is
completely smooth and does not involve any cost. My analysis gives
this prediction a somewhat more solid theoretical basis. Demand
deposits and credit are very close substitutes; depending on the
relative costs, optimizing agents replace one with the other with no
significant change in their purchasing power. For example, easier
availability of credit, represented by lower p in our model, induces
almost perfect substitution of credit for demand deposits (M and
$\mu$ decrease by very similar amounts). This is also the conclusion
if one uses $E_3(m)$ as the aggregate money supply, but not
$E_2(m)$. This effect seems to be analogous to the rise of bank
demand deposits leading to a reduction in the demand for currency.

This is a potentially useful insight for the puzzle of the missing
money (Goldfeld, 1976). The official money stock, $E_2(m)$, which
treats the trigger level $\mu$ as fixed at zero level, is
erroneously perceived to fall with extension of credit. The
difference between $E_2(m)$ and $E_3(m)$ tends to be quite
significant. For instance, when $\alpha$ = 7 percent, r = 2 percent,
p = 20 percent, g = $\sigma$ = 1, K = 0.01, and c = 0, one gets
M = 0.780 and $\mu$ = -0.362 to give $E_2(m)$ = 0.721, but
$E_3(m)$ = 1.071. A similar percentage of discrepancy arises for a
wide range of parameters.
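The figures in this example can be reproduced from the steady-state formulas. The sketch below assumes the three measures take the closed forms $E_1 = \frac{1}{2}[M + \mu + \sigma^2/g]$, $E_2 = (M-\mu)^{-1}[M^2/2 + M/\tau + \tau^{-2}(1 - e^{\mu\tau})]$, and $E_3 = \frac{1}{2}[M - \mu + \sigma^2/g]$ with $\tau = 2g/\sigma^2$; the function name `money_measures` is illustrative only.

```python
import math


def money_measures(M, mu, g, sigma):
    """Steady-state averages E1, E2, E3 of the money stock.

    Assumes the closed forms stated in the lead-in (equations (40)-(42)),
    with tau = 2*g/sigma**2 and mu <= 0 <= M.
    """
    tau = 2.0 * g / sigma ** 2
    e1 = 0.5 * (M + mu + sigma ** 2 / g)                    # all balances
    e2 = (M ** 2 / 2.0 + M / tau
          + (1.0 - math.exp(mu * tau)) / tau ** 2) / (M - mu)  # positive part
    e3 = 0.5 * (M - mu + sigma ** 2 / g)                    # distance from mu
    return e1, e2, e3


# Parameter values quoted in the text: M = 0.780, mu = -0.362, g = sigma = 1
e1, e2, e3 = money_measures(M=0.780, mu=-0.362, g=1.0, sigma=1.0)
print(round(e1, 3), round(e2, 3), round(e3, 3))
```

Under these assumptions the second and third values agree with the 0.721 and 1.071 quoted above, and the ordering $E_3 \ge E_2 \ge E_1$ of equation (37) holds.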

The model also suggests a similar explanation for the seemingly
excess variability of the velocity of money. Availability of credit
will lower both the trigger $\mu$ and the target M by similar
amounts without changing the velocity of money. If the quantity of
money is measured by $E_3(m)$, this will be the conclusion. However,
when the standard definition $E_2(m)$ is used, the velocity seems to
rise.


To conclude, my model suggests that the proper definition of the
medium of exchange has to include some measure of approved credit
lines available to the public. The exact nature of this measure
depends on the specific way in which credit is extended. In some
countries, for example, Britain and Israel, banks set a limit on the
overdraft facilities of firms and individuals, and these should be
included in the money supply data. Apparently, the overdraft rate
within the limit is relatively low, while the penalty for
overdrafting above the limit is very high; in this case, the trigger
level $\mu$ chosen by customers is probably identical to the limit
set by the bank, and the inclusion of unutilized overdraft
facilities in the money stock corresponds to the $E_3(m)$
definition. Similarly, the terms of trade credit are often set such
that $\mu$ coincides with the credit limit. Alternatively,
additional data, probably by sampling the credit history of firms
and individuals, can help determine the chosen credit limits.³¹ Some
indirect measure of $E_3(m)$ can be obtained from data on the
standard money supply $E_2(m)$. For example, for the deterministic
case we get [from equations (4), (6), and (7)]


(43)    $E_3(m) = \left(\dfrac{r+p}{p}\right)^2 E_2(m).$


³¹The Bank of Israel uses a sample of 400 companies
to study the liquidity position.
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VI. Concluding Remarks 


The current work can be extended in 
several ways. One way would be to allow 
two overdraft rates, a low one within a limit 
and a higher rate for overshooting. Another 
possibility is to allow for a second, higher 
trigger point that induces a reduction of the 
money stock when it is “too high,” as in 
Miller and Orr (1966). A different angle of 
this problem is the study of monetary policy 
with credit, whether utilized or not. My 
model suggests that either credit or mone- 
tary control can affect the price level, but 
when applied separately, their effectiveness 
is limited by the substitutability of credit
and demand deposits. 


APPENDIX 1 


This appendix presents the solution to the quasi-variational
inequality (13)-(15). Since the trigger-target (s, S) rule is the
optimal policy for this problem, V(m) can be defined over two
regimes. When the money stock is below the trigger point $\mu$, it
is increased to the target level M, and equation (14) holds as an
equality:


(A1)    $V(m) = K + c(M - m) + V(M)$    for $m < \mu$.


When $m > \mu$, no financial transaction is made, and
equation (13) holds as an equality. In this case, the
solution of the linear differential equation (13) is


(A2)    $V(m) = -\dfrac{r}{\alpha}\left(\dfrac{g}{\alpha} - m\right) + D_1 e^{\lambda_1 m} + D_2 e^{\lambda_2 m}$    for $m \ge 0$

(A3)    $V(m) = \dfrac{p}{\alpha}\left(\dfrac{g}{\alpha} - m\right) + E_1 e^{\lambda_1 m} + E_2 e^{\lambda_2 m}$    for $m < 0$.


The first term in (A2) and (A3) is a particular
solution of the nonhomogeneous part of equation (13),
and the last two terms are the general solution to the
homogeneous part. $D_i$ and $E_i$ (i = 1, 2) are constants to
be determined, and the roots of the characteristic
equation, $\lambda_1$ and $\lambda_2$, are given by


(A4)    $\lambda_1 = \sigma^{-2}\left[-(g^2 + 2\alpha\sigma^2)^{1/2} + g\right] < 0$

(A5)    $\lambda_2 = \sigma^{-2}\left[(g^2 + 2\alpha\sigma^2)^{1/2} + g\right] > 0.$
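As a quick numerical illustration, the two roots can be evaluated directly. This is a sketch under the assumption that (A4) and (A5) read $\lambda_{1,2} = \sigma^{-2}[g \mp (g^2 + 2\alpha\sigma^2)^{1/2}]$; the function name is illustrative. Expressions of this form always satisfy $\lambda_1 + \lambda_2 = 2g/\sigma^2$ and $\lambda_1\lambda_2 = -2\alpha/\sigma^2$, that is, they solve the quadratic $(\sigma^2/2)\lambda^2 - g\lambda - \alpha = 0$.

```python
import math


def characteristic_roots(g, sigma, alpha):
    """Roots lambda_1 < 0 < lambda_2 in the assumed form of (A4)-(A5)."""
    disc = math.sqrt(g ** 2 + 2.0 * alpha * sigma ** 2)
    lam1 = (g - disc) / sigma ** 2   # negative root, (A4)
    lam2 = (g + disc) / sigma ** 2   # positive root, (A5)
    return lam1, lam2


lam1, lam2 = characteristic_roots(g=1.0, sigma=1.0, alpha=0.07)
# Residual of the quadratic at each root; both should be close to zero.
for lam in (lam1, lam2):
    print(lam, 0.5 * lam ** 2 - lam - 0.07)
```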


Assume initially that $\mu < 0$. In this case, equation
(A3) describes the expected cost V(m) for $\mu < m < 0$.
Complete characterization of the solution requires
finding the values of $D_1$, $D_2$, $E_1$, $E_2$, $\mu$, and M. These
six parameters are solved using the following six conditions.

(a) Continuity at m = 0:

(A6)    $-\dfrac{rg}{\alpha^2} + D_1 + D_2 = \dfrac{pg}{\alpha^2} + E_1 + E_2.$

(b) Continuous derivative at m = 0:

(A7)    $\dfrac{r}{\alpha} + \lambda_1 D_1 + \lambda_2 D_2 = -\dfrac{p}{\alpha} + \lambda_1 E_1 + \lambda_2 E_2.$

(c) Continuity at $m = \mu$:

(A8)    $\dfrac{p}{\alpha}\left(\dfrac{g}{\alpha} - \mu\right) + E_1 e^{\lambda_1\mu} + E_2 e^{\lambda_2\mu} = K + c(M - \mu) + V(M).$

(d) Continuous derivative at $m = \mu$:

(A9)    $-\dfrac{p}{\alpha} + \lambda_1 E_1 e^{\lambda_1\mu} + \lambda_2 E_2 e^{\lambda_2\mu} = -c.$

(e) M is the optimal target. Optimizing over M in
equation (A1) yields

(A10)    $V'(M) = -c.$

(f) V(m) grows linearly at a rate $r/\alpha$ when $m \to \infty$:

(A11)    $\lim_{m \to \infty} V'(m) = \dfrac{r}{\alpha}$

which gives immediately

(A12)    $D_2 = 0.$


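The rest of the parameters are

(A13)    $D_1 = \dfrac{1}{\lambda_1}\left(\dfrac{p}{\alpha} - c\right)e^{-\lambda_1\mu} + \dfrac{1}{\lambda_1 - \lambda_2}\left(\dfrac{p+r}{\alpha}\right)\left[\dfrac{\lambda_2}{\lambda_1} - e^{(\lambda_2 - \lambda_1)\mu}\right]$

(A14)    $E_1 = \dfrac{1}{\lambda_1}\left(\dfrac{p}{\alpha} - c\right)e^{-\lambda_1\mu} - \dfrac{1}{\lambda_1 - \lambda_2}\left(\dfrac{p+r}{\alpha}\right)e^{(\lambda_2 - \lambda_1)\mu}$

(A15)    $E_2 = \dfrac{\lambda_1}{\lambda_2(\lambda_1 - \lambda_2)}\left(\dfrac{p+r}{\alpha}\right).$

The solution for the target M and the trigger $\mu$ is
given by the following equations:


(A16)    $M = (\alpha c + r)^{-1}\left[(\alpha c - p)\mu - \dfrac{p+r}{\lambda_2}\left(1 - e^{\lambda_2\mu}\right) - \alpha K\right]$

(A17)    $e^{-\lambda_1 M} = (\alpha c + r)^{-1}\left[(\alpha c - p)e^{-\lambda_1\mu} + \dfrac{p+r}{\lambda_2 - \lambda_1}\left(\lambda_2 - \lambda_1 e^{(\lambda_2 - \lambda_1)\mu}\right)\right].$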


It is straightforward to see that the analogue of
equations (A6)-(A11) for the case $\mu > 0$ yields the
solution $\mu = M$, which would lead to infinite expected
costs and which could, therefore, not be optimal. I thus
conclude that equations (A16) and (A17) determine
the optimal levels of the target money level M and the
trigger point $\mu$ and that $\mu$ satisfies $\mu < 0$.
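The two conditions can be solved numerically for given parameters. The sketch below assumes that (A16) and (A17) take the form $M = (\alpha c + r)^{-1}[(\alpha c - p)\mu - \frac{p+r}{\lambda_2}(1 - e^{\lambda_2\mu}) - \alpha K]$ and $e^{-\lambda_1 M} = (\alpha c + r)^{-1}[(\alpha c - p)e^{-\lambda_1\mu} + \frac{p+r}{\lambda_2 - \lambda_1}(\lambda_2 - \lambda_1 e^{(\lambda_2 - \lambda_1)\mu})]$, which is the form implied by conditions (A6)-(A12). The function name, the bracketing rule, and the bisection solver are illustrative assumptions and not the author's numerical method.

```python
import math


def solve_target_trigger(alpha, r, p, g, sigma, K, c=0.0):
    """Find (M, mu) with mu < 0 from the assumed forms of (A16)-(A17)."""
    disc = math.sqrt(g ** 2 + 2.0 * alpha * sigma ** 2)
    lam1 = (g - disc) / sigma ** 2          # negative root, (A4)
    lam2 = (g + disc) / sigma ** 2          # positive root, (A5)
    denom = alpha * c + r

    def M_from_A16(mu):
        return ((alpha * c - p) * mu
                - (p + r) / lam2 * (1.0 - math.exp(lam2 * mu))
                - alpha * K) / denom

    def M_from_A17(mu):
        rhs = ((alpha * c - p) * math.exp(-lam1 * mu)
               + (p + r) / (lam2 - lam1)
               * (lam2 - lam1 * math.exp((lam2 - lam1) * mu))) / denom
        return -math.log(rhs) / lam1

    def gap(mu):
        # Both equations must give the same target M at the solution.
        return M_from_A16(mu) - M_from_A17(mu)

    hi = -1e-9                               # gap < 0 just below zero
    lo = -1.0
    for _ in range(60):                      # widen until the sign changes
        if gap(lo) > 0.0:
            break
        lo *= 2.0
    else:
        raise ValueError("could not bracket a solution for mu")

    for _ in range(200):                     # bisection on mu
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return M_from_A16(mu), mu


M, mu = solve_target_trigger(alpha=0.07, r=0.02, p=0.20,
                             g=1.0, sigma=1.0, K=0.01)
print(round(M, 3), round(mu, 3))
```

For the parameter values used in Section V ($\alpha$ = 7 percent, r = 2 percent, p = 20 percent, g = $\sigma$ = 1, K = 0.01, c = 0), the solution should lie close to the reported M = 0.780 and $\mu$ = -0.362. The two curves are nearly parallel near the solution, so the root is sensitive to rounding in the inputs.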


APPENDIX 2 


In this appendix I derive equation (39) and present
the steady-state distribution of a Brownian motion with
a drift when there are two thresholds, $\mu_1$ and $\mu_2$, and
one target, M. The derivation is based on Chapter 15
of Karlin and Taylor (1981). They show (section 8, E)
that a ($\mu_1$, M, $\mu_2$) diffusion process with mean g and
variance $\sigma^2$ converges to the following limiting
distribution:


(A18)    $\phi(m) = \lim_{t \to \infty} \phi(m,t) = \dfrac{G(M,m)}{\int_{\mu_1}^{\mu_2} G(M,y)\,dy}$


where G(x, y) is the Green function of the diffusion
process, defined by


(A19) G(x, y) 


a TOIRE Stal 
for y S X<Y SH 
o Seal 


for w SY SX SH 


4 


and where S(·) and s(·), for a Brownian motion with a
drift, can be expressed as follows (Karlin and Taylor,
p. 205):


(A20)    $s(x) = \exp(-2gx/\sigma^2)$

         $S(x) = A\,s(x) + B$    (A and B are constants).
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The integration required in the denominator of equa- 
tion (A18) can be performed straightforwardly to get 


(A21)    $\int_{\mu_1}^{\mu_2} G(M,y)\,dy = \int_{\mu_1}^{M} G(M,y)\,dy + \int_{M}^{\mu_2} G(M,y)\,dy$


= 2AN Eslu) — slu o s (MY 
where N is defined by


(A22)    $N = (\mu_1 - \mu_2)\,s(\mu_1)\,s(\mu_2) + s(M)\left[s(\mu_1)(M - \mu_1) + s(\mu_2)(\mu_2 - M)\right].$


The division of equation (A19) by (A21) yields the
following solution for the steady-state distribution:


(A23)    $\phi(m) = \begin{cases} (1/N)\,[s(M) - s(\mu_2)]\,[s(\mu_1) - s(m)] & \text{for } \mu_1 \le m \le M \\ (1/N)\,[s(\mu_1) - s(M)]\,[s(m) - s(\mu_2)] & \text{for } M < m < \mu_2. \end{cases}$

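Since I am interested in the case of no upper boundary,
$\mu_2 \to \infty$, I get from equation (A22)


(A24)    $\lim_{\mu_2 \to \infty} N = s(\mu_1)\,s(M)\,(M - \mu_1)$


and from equation (A23)


(A25)    $\lim_{\mu_2 \to \infty} \phi(m) = \begin{cases} \dfrac{s(\mu_1) - s(m)}{s(\mu_1)(M - \mu_1)} & \text{for } \mu_1 \le m \le M \\[2ex] \dfrac{s(m)\,[s(\mu_1) - s(M)]}{s(\mu_1)\,s(M)\,(M - \mu_1)} & \text{for } M < m \end{cases}$


which is equation (39) in Section V.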
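The two densities can be checked numerically. The sketch below assumes $s(x) = \exp(-2gx/\sigma^2)$ and the forms of (A22), (A23), and (A25) given by $N = (\mu_1 - \mu_2)s(\mu_1)s(\mu_2) + s(M)[s(\mu_1)(M - \mu_1) + s(\mu_2)(\mu_2 - M)]$ and the corresponding piecewise densities. It verifies that each density integrates to one and that the two-threshold density approaches the one-threshold limit as the upper threshold grows. The helper names and the trapezoid rule are illustrative choices.

```python
import math


def make_density(mu1, M, g, sigma, mu2=None):
    """Steady-state density phi(m).

    mu2=None gives the one-threshold limit (A25); a finite mu2 gives the
    two-threshold density (A23). Both forms are assumptions stated in
    the lead-in, with s(x) = exp(-2*g*x/sigma**2).
    """
    gamma = 2.0 * g / sigma ** 2

    def s(x):
        return math.exp(-gamma * x)

    if mu2 is None:
        def phi(m):
            if m < mu1:
                return 0.0
            if m <= M:
                return (s(mu1) - s(m)) / (s(mu1) * (M - mu1))
            return s(m) * (s(mu1) - s(M)) / (s(mu1) * s(M) * (M - mu1))
        return phi

    N = ((mu1 - mu2) * s(mu1) * s(mu2)
         + s(M) * (s(mu1) * (M - mu1) + s(mu2) * (mu2 - M)))

    def phi(m):
        if m < mu1 or m > mu2:
            return 0.0
        if m <= M:
            return (s(M) - s(mu2)) * (s(mu1) - s(m)) / N
        return (s(mu1) - s(M)) * (s(m) - s(mu2)) / N
    return phi


def integrate(f, a, b, n=40000):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


phi_two = make_density(-0.362, 0.78, 1.0, 1.0, mu2=3.0)
phi_one = make_density(-0.362, 0.78, 1.0, 1.0)
print(integrate(phi_two, -0.362, 3.0), integrate(phi_one, -0.362, 30.0))
```

Both integrals should come out very close to one; the upper limit of 30 in the second integral is an arbitrary truncation of the exponentially decaying tail.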
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Tax Smoothing with Financial Instruments 


By HENNING BOHN*


The paper analyzes the optimal structure of government debt in a stochastic 
environment. In a model with distortionary taxes, the government should smooth 
tax rates over states of nature as well as over time. Government liabilities should 
be structured to hedge against macroeconomic shocks that affect the government 
budget. The optimal structure of government liabilities generally includes some 
“risky” securities which are state-contingent in real terms. The empirical part of 
the paper tests for tax smoothing and then studies state contingencies imple- 
mented by some specific securities including nominal debt, long-term bonds, 
equity, and foreign-currency debt. (JEL 321) 


The United States government issues Treasury bonds and bills of
various maturities. They are considered risk-free in terms of
default risk, though their real value may fluctuate considerably.
This paper is concerned with questions of what may motivate such a
debt structure and whether it is an optimal one.

The paper analyzes government policies 
that maximize welfare. The welfare-maxi- 
mizing approach of analyzing government 
debt policy was introduced by Robert Barro 
(1979). He shows that, in a deterministic 
environment, optimal tax and debt policy 
should smooth tax rates over time. Optimal 
policy in a stochastic environment calls for 
state-contingent tax rates that must be sup- 
ported by state-contingent debt, as Robert 
Lucas and Nancy Stokey (1983) have shown. 
I consider a stochastic version of Barro’s 
model. Key assumptions (discussed in more 
detail below) are that welfare losses due to 
distortionary taxation can be summarized by 
a convex function of tax rates and that asset 
prices and return distributions are exoge- 
nous. 

Optimal policy will then smooth tax rates 
over time and over states of nature. If there 
are macroeconomic shocks that affect the 


*Department of Finance, the Wharton School, University
of Pennsylvania, Philadelphia, PA 19104-6367. I
thank all the members of the macro-lunch group at the
University of Pennsylvania and three anonymous referees
for many valuable comments.
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government budget, government liabilities 
should provide a hedge against these shocks. 
This characterizes the optimal structure of 
government liabilities. 

If markets are complete and if the gov- 
ernment can trade on all markets, the model 
has the very strong implication that changes 
in tax rates should never occur, because 
government liabilities could be contingent 
on all conceivable shocks. However, mar- 
kets may be incomplete for various informa- 
tional or incentive reasons, and even if they 
were complete, the government may not be 
able to operate on all markets.¹ Definitive
answers on why certain markets are missing 


¹See Franklin Allen and Douglas Gale (1988) on
market incompleteness and Gale (1990) on the possibil- 
ity that innovative debt management may open new 
markets. Here, the critical issue is not missing markets 
per se, but government access to markets. Government 
activity on markets designed to provide hedges against 
budgetary uncertainty may be particularly problematic, 
because of incentive and asymmetric-information prob- 
lems. For example, the government might be able to 
reduce the volatility of taxes by issuing securities con- 
tingent on government spending or on the tax rate 
itself. However, if such securities were issued, the 
government would have an incentive to manipulate 
their payoffs by changing spending or taxes. One can- 
not claim, though, that incentive problems provide a 
full explanation for imperfect tax smoothing, since 
nominal debt is traded even though it clearly creates a 
time-consistency problem. Gale’s point that innovative 
debt management may improve welfare is very much 
consistent with this paper (see Section IV-C), but to be 
cautious, I will only consider currently existing securi- 
ties. 
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and on whether or not the government may 
trade on a certain market are beyond the 
scope of this paper. Nonetheless, unless one 
can rule out any incompleteness or incen- 
tive problems, it seems too much to require 
that the government stabilize tax rates per- 
fectly. Therefore, I take the set of available 
securities (bundles of state-contingent 
claims) considered for the government port- 
folio as exogenous. 

For any given set of securities, one can 
compute the optimal portfolio of govern- 
ment liabilities formed with these securities. 
The optimality conditions are that the tax 
rate is uncorrelated with the return on each 
security. For several well-known and widely 
traded securities (short- and long-term dol- 
lar bonds, stocks, foreign exchange, and 
foreign bonds), I test the zero-correlation 
condition and estimate optimal portfolios, 
making different assumptions about the set 
of available securities. The tests provide an 
assessment of tax smoothing as a positive 
theory of government. The estimated opti- 
mal portfolios in comparison with the actual 
structure of U.S. government debt show in 
which direction debt policy should be modi- 
fied to improve tax smoothing. Since the 
estimation is more constructive and because 
of robustness considerations, the emphasis 
will be on the analysis of optimal portfolios, 
rather than on testing. 

For the United States, several questions 
about the debt structure are interesting un- 
der positive as well as normative aspects. 
The first question relates to the absence of 
indexed debt (see, e.g., Stanley Fischer, 
1983; Bohn, 1988). Nonexistence may be 
interpreted as indicating that a risk-free as- 
set cannot be created, or it may be an 
equilibrium phenomenon, meaning that the 
government could issue indexed debt if it 
wanted to. The example of Britain suggests 
that indexed debt could be issued easily. 
The paper shows that, indeed, the specific 
state contingency implemented by relating 
the real value of government debt to infla- 
tion has desirable hedging properties. Thus, 
nonindexation is consistent with optimal 
policy.²


2This abstracts from issues of time consistency, 
which would favor indexation (Guillermo Calvo, 1978; 
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A second question is about the maturity 
structure of debt. In a discrete-time frame- 
work, the real value of debt with maturity 
greater than one period varies with nominal 
interest rates. I show that this type of con- 
tingency is also desirable for a welfare-max- 
imizing government, though the evidence is 
weaker than that in favor of nominal debt. 

Third, one may ask whether the govern- 
ment can improve on the current practice of 
issuing only domestic debt securities. 
Though an exhaustive survey of alternatives
is beyond the scope of this paper, the two 
cases of stocks and foreign-currency debt 
are explored. One finds that the govern- 
ment could indeed hedge against shocks 
due to cyclical fluctuations by taking a short 
position in the stock market. Movements in 
some foreign interest rates also have desir- 
able hedging properties, though there is lit- 
tle support for taking exchange-rate risk. 
Overall, it seems that the government could 
improve welfare by looking outside the set 
of dollar-denominated debt securities in structuring
its liabilities.

The results on debt management rely 
heavily on the optimality of tax smoothing. 
Though the notion that excess burden in- 
creases on the margin with increasing tax 
rates is familiar from atemporal models of 
taxation, tax smoothing is not always opti- 
mal in dynamic models (see Olivier Blan- 
chard and Fischer, 1989 Ch. 11.3; Richard 
Tresch, 1981 Part III). In general, excess 
burden may depend on other variables in 
addition to current tax rates (see, e.g., Lucas 
and Stokey, 1983). The tax-smoothing ap- 
proach may work well for economies where 
taxes are labor income taxes or other taxes 
with largely static effects, but it is probably 
less appropriate for cases in which taxation 
has significant effects on interest rates or on 
capital accumulation (see Lucas and Stokey 
[1983] and Kenneth Judd [1989], respec- 
tively). In particular, if debt management 
affected interest rates, the qualitative na- 
ture of the government’s optimization prob- 
lem would change significantly, because the 


Lucas and Stokey, 1983; Bohn, 1988). Time-consistency
issues have intentionally been omitted from the paper,
because they would distract from the analysis of the
government debt portfolio as a whole.

government would no longer behave as a price taker on financial
markets. Thus, the normative results of this paper should apply to
economies where debt management has few macroeconomic effects and
where the current tax rate is the main determinant of excess burden,
but they may have to be interpreted more cautiously otherwise.³

A final simplifying assumption is that individuals are risk-neutral.
All expected returns are then tied to the rate of time preference,
making them exogenous in a straightforward way. Somewhat
surprisingly for a paper concerned with hedging, this assumption
does not seem to affect the results; but it helps to focus on the
government’s problem. Regardless of the degree of individual risk
aversion, the convexity of excess burden makes the government act
“more risk-averse” than taxpayers and implies that tax smoothing is
a close approximation to the optimal policy. Since the risk-neutral
case shows most clearly how the government’s hedging motive derives
from the (added) concavity in the welfare function created by
distortionary taxes, risk neutrality is imposed throughout (see the
end of Section I for more details).

The paper is organized as follows. Section I sets up a simple model
that provides a tax-smoothing argument for why governments may want
to hedge against economic uncertainty. Tests for tax smoothing are
in Section II. Section III derives an equation for the optimal
liability structure, and Section IV contains estimates of optimal
portfolios containing nominal, long-term, and foreign-currency bonds
and stocks. Section V summarizes the results.


I. A Framework for Analysis 


Barro (1979) has shown that, with distortionary taxes, the
government should smooth tax rates over time. This section shows
that Barro’s approach generalizes in a stochastic environment to tax
smoothing over states of nature. The government behaves as if it is

³At least for the choice of maturities, the assumption
that debt management has little macroeconomic
effect seems empirically defensible; see Franco
Modigliani and Richard Sutch (1966).
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averse to the risk of changing tax rates, even 
if all individuals are risk-neutral. 

I consider a model similar to Barro (1979), 
except that risky securities are added. In 
period t, identical, infinitely lived individu- 
als maximize 


(1)    $U_t = E_t \sum_{j \ge 0} \rho^j c_{t+j}$


where $0 < \rho < 1$ is a discount factor and $c_{t+j}$ is
consumption in period t + j. They own a stream of endowments
$Y_{t+j}$ and may trade K + 1 assets. Let $A_{t,k}$ be the quantity
of asset k (k = 0, 1, ..., K) purchased in period t, $p_{t,k}$ be
the price of asset k in terms of consumption goods (ex dividend),
and $f_{t+j,k}$, $j \ge 1$, be the stream of cash flows (interest
payments or dividends) in future periods. Let returns be denoted by
$r_{t+1,k} = (p_{t+1,k} + f_{t+1,k})/p_{t,k} - 1$.
Individuals pay taxes on endowments at a rate $\tau_t$. I assume
that taxes are distortionary (e.g., because of wasteful efforts of
evading or sheltering income). Following Barro (1979), the excess
burden of taxation is summarized by a loss function $h(\tau_t)$,
which indicates the fraction of endowment “wasted” when taxes are
$\tau_t$. Then, the individual budget constraint is


(2)    $c_t + \sum_k p_{t,k} A_{t,k} = Y_t\left[1 - \tau_t - h(\tau_t)\right] + \sum_k (p_{t,k} + f_{t,k}) A_{t-1,k}.$


Individual optimization implies asset-pricing equations
$p_{t,k} = \rho E_t(p_{t+1,k} + f_{t+1,k})$ for all k. That is, all
expected returns must be equal:


(3)    $E_t(1 + r_{t+1,k}) = \dfrac{1}{\rho}$    for all k.


It is convenient to introduce several specific securities that will
be analyzed later. First, let k = 0 be a risk-free (in real terms,
i.e., price-level-indexed) one-period security that has a price
$p_{t,0} = 1$ and return $r = 1/\rho - 1 > 0$ for all t. Then one
can define excess returns $\hat r_{t+1,k} = r_{t+1,k} - r$ on assets
$k \ge 1$. Individual optimization can be summarized by
$E_t \hat r_{t+1,k} = 0$.

Second, I want to discuss assets with returns defined in terms of a
nominal unit of account, money. However, I do not want to focus on
asset-pricing issues specific to monetary models, nor do I want to
discuss optimal monetary policy. While both issues are important
topics in themselves, all I need here is a well-defined price level.
Therefore, I will just assume that the price level $P_t$ and the
rate of inflation, $\pi_t = \log(P_t/P_{t-1})$, follow some
stochastic processes. These assumptions could be motivated more
rigorously as a limit of a cash-in-advance model with a “small”
monetary sector.
Third, some securities may be denominated in a foreign currency.
Given risk-neutrality of domestic individuals, the closed-economy
market-clearing conditions are not essential. That is, the existence
of “other” individuals abroad does not change the model
significantly. Whenever necessary, I will therefore assume that
payoffs of some securities may depend on variables defined within a
foreign economy.

The government uses tax revenues $T_t = \tau_t Y_t$ to finance
government spending $G_t$ and to service the government debt. I
assume that the government can issue arbitrary quantities $D_{t,k}$
of the securities k at the market price. The government budget
constraint is


(4)    $T_t = \tau_t Y_t = G_t + \sum_k (p_{t,k} + f_{t,k}) D_{t-1,k} - \sum_k p_{t,k} D_{t,k}.$


Individual welfare can be written as a function of government policy
by substituting (4) and (2) into the individual objective function
and dropping irrelevant terms:⁴


(5)    $U_t = E_t\left\{\sum_{j \ge 0} \rho^j\, Y_{t+j}\left[1 - h(\tau_{t+j})\right]\right\}.$


⁴To be exact, $\sum_k (p_{t,k} + f_{t,k})(A_{t-1,k} - D_{t-1,k}) - E_t \sum_{j \ge 0} \rho^j G_{t+j}$
should be added to the right-hand side
of (5); but since this is an additive, exogenous term, it
does not affect decisions.

The government chooses tax rates and debt structure to maximize (5)
subject to (4). In effect, the government objective is to minimize
the expected present value of excess burden. The first-order
conditions for optimal policy are


(6a)    $E_t\left[h'(\tau_{t+1})\right] = h'(\tau_t)$    for k = 0

(6b)    $\rho E_t\left[h'(\tau_{t+1})(1 + r_{t+1,k})\right] = h'(\tau_t)$    for all k > 0.


To simplify, assume that excess burden is quadratic; that is,
$h(\tau_t) = (h/2)\tau_t^2$ for some h > 0. Then (6a) implies tax
smoothing over time, $E_t \tau_{t+1} = \tau_t$, as in Barro (1979),
or in terms of stationary variables,


(7a)    $E_t(\Delta\tau_{t+1}) = 0.$


This also determines the path of total debt.
Using (3), equation (6b) can be rewritten as


(7b)    $E_t\left[\Delta\tau_{t+1}(1 + r_{t+1,k})\right] = 0$    for all k > 0.


Notice that this equation (though not (7a)) would hold even if it
were impossible to issue a risk-free asset. Combining (7a) and (7b),
one obtains


(8)    $E_t(\Delta\tau_{t+1}\,\hat r_{t+1,k}) = E_t(\hat\tau_{t+1}\,\hat r_{t+1,k}) = \mathrm{Cov}_t(\tau_{t+1}, r_{t+1,k}) = 0$

where f 41 = 7,4, — E,T,+1 is the innovation 
in tax rates. That is, the government should 
stabilize taxes across possible states of na- 
ture. This implies a zero conditional covari- 
ance between taxes and returns on all avail- 
able securities. The optimality conditions 
(7) and (8) will be tested in Section II. They 
implicitly characterize the optimal debt 
structure, since taxes are a function of debt 
policy through the. budget constraint. This 
link will be made explicit in Section III. 
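
The step from (7a) and (7b) to (8) is compressed in the text. A sketch of the algebra, using only the definitions above (with $\hat r_{t+1,k} = r_{t+1,k} - E_t r_{t+1,k}$ as the notation for a return innovation), might read:

```latex
\begin{align*}
E_t[\Delta\tau_{t+1}\, r_{t+1,k}]
  &= E_t[\Delta\tau_{t+1}(1 + r_{t+1,k})] - E_t[\Delta\tau_{t+1}] = 0
     && \text{by (7b) and (7a)}, \\
E_t[\Delta\tau_{t+1}\, r_{t+1,k}]
  &= E_t[\hat\tau_{t+1}\, r_{t+1,k}]
     && \text{since } E_t\tau_{t+1} = \tau_t \text{ gives } \Delta\tau_{t+1} = \hat\tau_{t+1}, \\
E_t[\hat\tau_{t+1}\, r_{t+1,k}]
  &= E_t[\hat\tau_{t+1}\, \hat r_{t+1,k}] + E_t[\hat\tau_{t+1}]\, E_t[r_{t+1,k}]
   = \mathrm{Cov}_t(\tau_{t+1}, r_{t+1,k})
     && \text{since } E_t[\hat\tau_{t+1}] = 0.
\end{align*}
```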

At this point, a remark on the assumption of risk-neutrality may be
appropriate. If preferences (1) were replaced by a time-additive
concave utility function (with money and foreigners excluded for the
purpose of this argument), the ratio of $t+1$ and $t$-dated marginal
utilities would enter (3), as in Lucas (1978). Through a similar
modification of (5), the same ratio of marginal utilities would enter
equations (6)-(8). If one takes such a consumption-based
capital-asset-pricing model as the maintained hypothesis for normative
analysis, a variation of Rajnish Mehra and Edward Prescott's (1985)
argument (based on the fact that consumption growth has low variance)
can be used to show that changes in the marginal rate of substitution
are quantitatively unimportant (details available from the author).
Independently from the Mehra-Prescott argument, the fact that tax rates
and consumption have low correlation in U.S. data (less than 0.05 in
absolute value for the period 1954-87, using nondurables) suggests that
risk-neutrality is a justifiable simplification. Risk aversion does not
affect the basic intuition that, because of the convex excess burden in
its objective function, the government should smooth tax rates.⁵


II. Optimal Debt 


In this section, the first-order conditions for taxes, (7) and (8), are
tested for quarterly postwar United States data. The basic sample
periods are 1954:2-1987:4 for domestic series and 1973:1-1987:4 for
international series. The measure of tax rates is the ratio of federal
tax revenue to GNP.⁶


⁵The reference to Mehra and Prescott (1985) leads to a more
fundamental problem, however, because their point was that
consumption-based asset pricing cannot explain observed risk premiums.
If the Mehra-Prescott puzzle means that actual risk premiums are too
high for no good reason, one may speculate whether the optimal debt
portfolios should be biased toward assets that promise low expected
returns. However, if the "true" model that justifies the observed high
equity premiums is similar to the Lucas (1978) model in that the same
risk factor (perhaps something other than marginal utility of
consumption) appears in both consumer and government first-order
conditions, the basic intuition will still apply.
The use of quarterly data seems necessary to obtain reasonably precise
estimates of the covariances, but it may be somewhat problematic if tax
policy is set less frequently. (The optimal debt portfolios in the next
section will turn out to be more robust.) If a quarterly revision of
tax policy in light of news about all relevant variables is not
feasible in practice (e.g., for technical reasons [tax laws specify
annual payments] or because of legislative delays), significant
correlations in quarterly data should not be interpreted as rejections
of optimality but, rather, as an indication that tax smoothing could be
improved, if it were possible to adjust tax laws more frequently.
Still, equations (7) and (8) yield the most natural and direct tests of
the tax-smoothing model, even if negative conclusions may have to be
interpreted cautiously.

The random-walk condition (7a) will only be discussed briefly, since it
has been tested before (see Chaipat Sahasakul, 1986; Barro, 1981). Over
the postwar sample, the change in tax rates has an insignificant mean
of 0.023 percent ($t = 0.59$) and small autocorrelations. The series was
regressed on its own lags, on various lagged asset returns, and on
monetary data (those used in Section IV). The hypothesis that changes
in tax rates are unpredictable was not rejected in any of the tests;
the details are therefore omitted. These results confirm Barro (1981),
but they are in contrast to the results of Sahasakul, who finds
evidence against tax smoothing for a sample period that includes World
War II. However, if one looks more carefully, the results do not
contradict Sahasakul: simple tax-smoothing models apparently have some
problems in explaining World War II data (which is not the subject of
this paper). On the other hand, (7a) cannot be rejected for the postwar
period.⁷

⁶The reason for using this measure instead of, say, marginal federal
income tax rates is that it is unclear what "the" marginal tax rate is
on aggregate. Taxes are levied not just on income but also on other
activities; rates differ across individuals; and tax laws contain a
multitude of exemptions and exceptions. On aggregate, however, all
taxes must be paid out of the available economic resources, GNP. A
higher revenue:GNP ratio implies higher tax rates somewhere, no matter
how tax laws are structured in detail. Therefore, I consider changes of
the revenue:GNP ratio as indicators of changes in tax rates and excess
burden. See Barro (1981) for further discussion.
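
The random-walk check described for (7a) amounts to asking whether changes in the tax rate have zero mean and no serial correlation. A minimal sketch of those two statistics follows; the series `tau` is hypothetical illustration data, not the revenue:GNP series used in the paper, and the helper names are assumptions of this sketch.

```python
import math

def mean_t_stat(x):
    """t statistic for H0: mean(x) = 0, using the sample standard deviation."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x) / (n - 1)
    return m / math.sqrt(var / n)

def autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i - lag] - m) for i in range(lag, n))
    return num / denom

# Hypothetical quarterly tax rates (revenue:GNP ratio, in percent).
tau = [18.0, 18.3, 17.9, 18.1, 18.4, 18.2, 18.0, 18.5, 18.3, 18.6]
d_tau = [b - a for a, b in zip(tau, tau[1:])]  # first differences

print("t statistic of mean change:", round(mean_t_stat(d_tau), 2))
print("first-order autocorrelation:", round(autocorr(d_tau), 2))
```

Under tax smoothing, both statistics should be statistically indistinguishable from zero; the regressions on lagged returns and monetary data mentioned in the text extend the same idea to other predictors.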

Equation (8) implies that innovations in 
tax rates should be uncorrelated with all 
innovations in returns. This is a prediction 
of the tax-smoothing model that, to my 
knowledge, has not been tested before. 
For the implementation of the test, notice 
that all linear combinations of security 
returns—differences between pairs of re- 
turns in particular—must also be uncorre- 
lated with tax changes and that fixed com- 
ponents of returns do not matter for return 
innovations. Any real return is the differ- 
ence of the nominal return and inflation. 
The innovation in inflation is —1 times the 
real return innovation of a one-period nom- 
inal bond. Thus, provided a one-period 
nominal bond (a three-month Treasury bill) 
is in the set of assets, equation (8) implies 
zero covariance between innovations in tax 
changes and all nominal returns. Inflation 
data are not needed. Using linear combina- 
tions and dropping known components are 
also useful for cases in which returns have 
several parts (e.g., for bonds with different 
maturities and for foreign-currency invest- 
ments). 

⁷Sahasakul (1986) examines contemporaneous relations between tax rates
and other government-sector variables instead of the predictability of
tax rate changes, and he uses marginal income-tax rates instead of the
revenue:GNP ratio as basic data series. He finds a positive relation
between temporary military spending and tax rates, which indicates a
violation of equation (7a). Using the data provided in his paper, I
find that tax rates are indeed not a random walk over his sample period
(1937-82); they are positively autocorrelated. However, for the postwar
period (1954-82), (a) the positive relation between temporary military
spending and tax rates vanishes, and (b) one cannot reject that tax
rates follow a random walk. A Chow test indicates a significant break
in the autocorrelation coefficient.

Covariances between innovations of tax rates and returns were computed
for a variety of widely available securities. The return series are the
change in three-month Treasury bill yields (symbolized DTB), the
nominal return on long-term Treasury bonds (LRET), nominal stock returns
measured by the Standard and Poor's (S&P) 500 index (STOCK), and
changes in German and Japanese exchange rates (EG, EJ), money-market
yields (SG, SJ), and long-term government bond yields (LG, LJ).⁸ Yield
changes in nominal bonds are used as proxies for capital gains, which
are the uncertain part of bond returns.⁹ Exchange rates capture the
risky component of the return on a three-month foreign money-market
investment. Changes in foreign yields measure the difference between
long- and short-term foreign bond-market investments. (Though some of
the series represent linear combinations or only the uncertain
components of returns, I will refer to them as "returns.") Finally, to
test whether indexed debt might improve tax smoothing, the innovation
in inflation (GNP-deflator; symbolized P) is considered as a potential
return series.

Innovations in returns and tax rates were computed from vector
autoregressions (VAR) of tax rates and returns, generally with four
lags. For each return series, Table 1 displays the correlation,
$\rho_k$, between innovations in taxes and the return, the covariance
denoted by $c_k = E_t(\hat\tau_{t+1}\hat r_{t+1,k})$, and its asymptotic standard
error denoted by std-$c_k$. In addition, Table 1 indicates (by +
symbols) whether an ordinary least-squares regression of tax rates on
the current return and lagged values of both series has a significant
coefficient on the current return. This test is motivated by the fact
that the zero-covariance restriction (8) implies a value of zero for
this regression coefficient.

⁸Data sources are national income accounts for macroeconomic series,
the International Monetary Fund for international data, the Center for
Research in Security Prices (CRSP) for Treasury-bill yields, and
Ibbotson Associates for stock and bond returns. Domestic returns are
based on the last day of a quarter, and international returns are based
on the last month of a quarter. Details are available from the author.

⁹Because of changing duration, this is only an approximation for
finite intervals. For the Treasury-bill market, exact three-month
holding returns on six-month bills were available from 1959 on.
Estimated results for return innovations computed from yield changes
and from holding returns were very similar. Thus, the longer series on
yield changes was used throughout the study.

TABLE 1: CORRELATIONS WITH TAX RATES

Return series     ρ_k       c_k      std-c_k
A. 1954:2-1987:4:
DTB             -0.201    -0.243     0.106** ++
LRET            -0.281    -0.627     0.206*** +++
STOCK           -0.196    -0.646     0.289** ++
P               -0.294    -0.051     0.016*** +++
B. 1973:1-1987:4:
EG               0.091     0.276     0.393
EJ               0.019     0.056     0.377
SG              -0.223     0.029     0.017*
LG              -0.431     0.031     0.010*** +++
SJ              -0.255    -0.027     0.014**
LJ              -0.141    -0.096     0.088

Legend: ρ_k is the correlation between return series k and the tax
rate, c_k is the covariance between return series k and the tax rate,
and std-c_k is the asymptotic standard error of c_k.

Notes: Stars indicate rejection of H0: c_k = 0 at the 10-percent (*),
5-percent (**), or 1-percent (***) significance level based on the
asymptotic standard error std-c_k. Plus signs (+, ++, or +++) indicate
rejection of the same hypothesis at the same, respective, significance
levels, based on a regression of tax rates on current returns and four
lags of both series.

As Table 1 shows, tax smoothing is clearly rejected with both domestic
bond series, with stock returns, with the inflation series, and with
some of the international
series. That is, the government would have been able to improve tax
smoothing by having different quantities of the corresponding
securities in its portfolio of liabilities.¹⁰ Unfortunately, these
statistical rejections provide little information on how policy could
be improved, and they may be sensitive to institutional rigidities in
the tax-setting process.¹¹ The next section will therefore explore how
the optimal structure of government liabilities can be computed
explicitly.

¹⁰To verify that the results are not due to misspecification of the
regressions or the assumed availability of a risk-free asset, a test
based on equation (7b) was implemented by testing whether the product
series $\Delta\tau_{t+1}(r_{t+1,k} - r_{t+1,l})$ has a zero mean for any pair of
security returns $(k, l)$. [The product $\Delta\tau_{t+1}(1 + r_{t+1,k})$ was not
used directly, because its mean is dominated by the $\Delta\tau_{t+1}$
component; a test would differ little from testing (7a).] Series LRET
and STOCK, for which complete return data are available, were taken as
securities $k$, the three-month Treasury bill as $l$. The $t$ statistics
of -3.71 and -2.00 confirm the rejections reported in Table 1 at the
same levels of significance (1 percent and 5 percent, respectively).
Since the simple tests already yield strong rejections, more elaborate
tests (e.g., following Lars Hansen and Kenneth Singleton [1983]) seem
unnecessary.
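
The covariance test behind Table 1 can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the standard error below treats the products of innovations as serially uncorrelated, the star thresholds are the usual two-sided normal critical values, and the function names are hypothetical. The loop re-derives the star ratings from the covariances and standard errors reported in Part A of Table 1.

```python
import math

def cov_with_se(u, v):
    """Covariance of two zero-mean innovation series and an asymptotic standard error.

    Assumption of this sketch: the products u_t * v_t are serially
    uncorrelated, so Var(c_hat) is approximated by Var(u_t * v_t) / n.
    """
    n = len(u)
    prods = [a * b for a, b in zip(u, v)]
    c = sum(prods) / n
    var_prod = sum((p - c) ** 2 for p in prods) / n
    return c, math.sqrt(var_prod / n)

def stars(c, se):
    """Significance stars from a two-sided normal test of H0: c = 0."""
    z = abs(c / se)
    if z >= 2.576:
        return "***"
    if z >= 1.960:
        return "**"
    if z >= 1.645:
        return "*"
    return ""

# Covariances and standard errors as reported in Table 1, Part A.
for name, c, se in [("DTB", -0.243, 0.106), ("LRET", -0.627, 0.206),
                    ("STOCK", -0.646, 0.289), ("P", -0.051, 0.016)]:
    print(name, round(c / se, 2), stars(c, se))
```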


III. The Optimal Structure 
of Government Liabilities 


To obtain a solution for the optimal debt structure, the link between
innovations in debt and tax rates must be made explicit (i.e., a
formula for innovations in tax rates, $\hat\tau_{t+1}$, is needed). Let $y_t$
be the growth in output (= endowments; empirically, GNP), let $\bar y$ be
the mean, and assume $\bar y < r$. Denote the new information about
period-$(t+j+1)$ output growth by
$\hat y_{t+1+j} = E_{t+1} y_{t+1+j} - E_t y_{t+1+j}$, denote the new information
about period-$(t+j+1)$ government spending relative to output by
$\hat g_{t+1+j} = (E_{t+1} G_{t+1+j} - E_t G_{t+1+j})/Y_t$, and let
$d_{t,k} = p_{t,k} D_{t,k}/Y_t$ be the ratio of security-$k$ debt to output.
Then, an approximate solution for the period-$(t+1)$ innovation in tax
rates is

(9)  $\hat\tau_{t+1} = \tau_{t+1} - E_t\tau_{t+1}
        = (1-\psi)\exp(-\bar y)\Big[\sum_k d_{t,k}\,\hat r_{t+1,k} + \sum_{j \ge 0}\psi^j \hat g_{t+1+j}\Big]
          - \tau_t \sum_{j \ge 0}\psi^j \hat y_{t+1+j}$

where $0 < \psi = \rho\exp(\bar y) < 1$ is a discount factor.

¹¹It may be tempting to use the correlations in Table 1 to assess the
economic significance of the rejections. The fact that most
correlations are below 0.3 in absolute value (except for LG) suggests
that the optimal supply of any one security would have reduced the
variance of tax rates by less than 10 percent (18.6 percent for LG;
11.7 percent for all four securities in Part A of Table 1 jointly,
estimated using a four-lag VAR with tax rates and all four return
series). However, as noted earlier, one should be cautious in
interpreting the results based on high-frequency tax-rate data. If
there are short-term institutional rigidities in tax policy, marginal
changes in the liability portfolio (to optimize debt management) might
not translate into changed tax policy on a quarterly basis. The tests
show that the variance of tax rates has not been minimized, but they do
not reveal what is responsible for the excessive variance.
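
The variance-reduction figures quoted in the footnote on economic significance follow from a standard result: hedging with a single security can remove at most the squared correlation of the variance being hedged. A short check against the correlations reported in Table 1 (the function name is an assumption of this sketch):

```python
# Correlations between tax-rate and return innovations, from Table 1.
rho = {"DTB": -0.201, "LRET": -0.281, "STOCK": -0.196, "P": -0.294, "LG": -0.431}

def variance_reduction(r):
    """Share of variance removed by the best single-security hedge (r squared)."""
    return r * r

for name, r in rho.items():
    print(f"{name}: {100 * variance_reduction(r):.1f} percent")
```

The LG entry reproduces the 18.6 percent figure, and any correlation below 0.3 in absolute value gives a reduction below 9 percent.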

The derivation is conceptually straightforward but lengthy.¹² The idea
is that the present value of tax revenues must cover initial debt plus
the present value of spending. Any innovation in current or future
government spending and any unexpected change in the value of debt
therefore forces the government to adjust tax revenues eventually.
Because of tax smoothing, a fraction $(1-\psi)$ of the adjustment takes
place immediately. In addition, since the present value of tax revenues
depends on the path of output, tax rates have to be changed whenever
new information about current or future output is received. As a
result, tax rates are increased if the value of debt increases
unexpectedly, if estimates of future government spending are revised
upwards, or if output is lower than expected.
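
The three channels described above can be illustrated by evaluating formula (9) numerically. The sketch below assumes the form of (9) implied by the definitions in this section and by the weighting factor $w_t$; the function name, argument names, and all input values are hypothetical.

```python
import math

def tax_rate_innovation(psi, ybar, tau, d, r_hat, g_news, y_news):
    """Evaluate the approximate formula (9) for the innovation in the tax rate.

    psi    : growth-adjusted discount factor, 0 < psi < 1
    ybar   : mean output growth per period
    tau    : current tax rate
    d      : debt-to-output ratios d_{t,k}, one per security
    r_hat  : return innovations, one per security
    g_news : revisions in expected spending/output for periods t+1, t+2, ...
    y_news : revisions in expected output growth for periods t+1, t+2, ...
    """
    debt_shock = sum(dk * rk for dk, rk in zip(d, r_hat))
    pv_g = sum(psi ** j * g for j, g in enumerate(g_news))
    pv_y = sum(psi ** j * y for j, y in enumerate(y_news))
    return (1 - psi) * math.exp(-ybar) * (debt_shock + pv_g) - tau * pv_y

# Hypothetical inputs: debt of 50 percent of output, tax rate of 20 percent.
base = dict(psi=0.99, ybar=0.0, tau=0.2, d=[0.5])
debt_up = tax_rate_innovation(r_hat=[0.02], g_news=[], y_news=[], **base)
perm_y = tax_rate_innovation(r_hat=[0.0], g_news=[], y_news=[0.01], **base)
temp_y = tax_rate_innovation(r_hat=[0.0], g_news=[], y_news=[0.01, -0.01], **base)
print(debt_up, perm_y, temp_y)
```

In this example an unexpected rise in the value of debt raises the tax rate, while good news about output lowers it; a growth surprise that is reversed one period later (a temporary level change) has a far smaller effect than one that is not reversed.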

It may be worth noting that this argument 
applies even if tax rates cannot be adjusted 
every period. Then, optimal debt policy 
would have to stabilize the Lagrange multi- 
plier of the budget constraint, which would 
replace the marginal excess burden $h'(\tau)$ in
the first-order condition (8). This multiplier 
would be perfectly correlated with the 
right-hand side of (9), leaving the results for 
optimal debt structure unchanged. 

¹²Details are available from the author. Quadratic excess burden (7) is
assumed, and the growth rate of output, $y_t$, should be stationary but
not necessarily independently and identically distributed.

Using the tax-rate formula (9) in the first-order condition (8), one
obtains a system of $K$ equations for the $K$ "risky" securities,
$d_{t,k}$, issued by the government:

$\sum_k \mathrm{Cov}_t(\hat r_{t+1,l}, \hat r_{t+1,k})\, d_{t,k}
   + \mathrm{Cov}_t\Big(\hat r_{t+1,l}, \sum_{j \ge 0}\psi^j \hat g_{t+1+j}\Big)
   - w_t\, \mathrm{Cov}_t\Big[\hat r_{t+1,l}, \sum_{j \ge 0}\psi^j \hat y_{t+1+j}\Big] = 0$

for all $l$, where $w_t = [\exp(\bar y)/(1-\psi)]\tau_t$ is a weighting factor.
To simplify, let $d_t$ be the vector of risky government debt securities
$d_{t,k}$, let $\Sigma_{rr}$ be the variance-covariance matrix of returns,
assumed to be nonsingular, and let $\Sigma_{ry}$ and $\Sigma_{rg}$ be the vectors
of covariances between returns and the present-value expressions
$\sum_{j \ge 0}\psi^j \hat y_{t+1+j}$ and $\sum_{j \ge 0}\psi^j \hat g_{t+1+j}$,
respectively. Then, the above equation can be restated as
$\Sigma_{rr} d_t + \Sigma_{rg} = w_t \Sigma_{ry}$, and the optimal debt structure is

(10)  $d_t = w_t \Sigma_{rr}^{-1}\Sigma_{ry} - \Sigma_{rr}^{-1}\Sigma_{rg}.$


Thus, the general formula for optimal debt 
structure involves covariances of returns 
with innovations in output and government 
spending. 
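
As a concrete illustration of formula (10), the following sketch solves the system for two risky securities. The covariance inputs are hypothetical numbers chosen for the example, not estimates from the paper, and the 2x2 solver is used only to keep the block dependency-free.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("covariance matrix must be nonsingular")
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]

def optimal_debt(sigma_rr, sigma_ry, sigma_rg, w):
    """d = Sigma_rr^{-1} (w * Sigma_ry - Sigma_rg), as in formula (10)."""
    rhs = [w * y - g for y, g in zip(sigma_ry, sigma_rg)]
    return solve_2x2(sigma_rr, rhs)

# Hypothetical covariance inputs for two risky securities.
sigma_rr = [[4.0, 1.0], [1.0, 2.0]]   # return variance-covariance matrix
sigma_ry = [0.30, 0.10]               # covariances with the PV of output growth
sigma_rg = [0.00, 0.00]               # covariances with the PV of spending
d = optimal_debt(sigma_rr, sigma_ry, sigma_rg, w=20.0)
print(d)
```

A security whose return covaries positively with news about output receives a positive weight: its value falls when output disappoints, which offsets the pressure to raise tax rates.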

Output (or equivalently, aggregate income)
matters in this model, because it
forms the tax base and because high tax 
rates cause distortions. Given desired levels 
of revenues, tax rates must increase if out- 
put falls. Since tax rates are smoothed over 
time, any news about future output is also 
relevant. Consequently, permanent changes 
in output have much larger effects than 
temporary changes. Unexpectedly high gov- 
ernment spending has an additional effect 
on tax rates. The optimal debt structure is
chosen to hedge against these sources of 
uncertainty and thereby minimizes fluctua- 
tions in tax rates. 


IV. Estimates of the Optimal 
Debt Structure 


In this section, formula (10) will be used 
to explore the optimal structure of govern- 
ment liabilities. 


A. Methodology 


Equation (10) identifies uncertain output and uncertain government
spending as sources of risk. Since the two covariance vectors enter as
weighted sums in this equation, the optimal debt structure can be
interpreted as the sum of two components. A priori, it seems likely
that output variation is a quantitatively significant source of risk;
the cyclical volatility of budget deficits is well documented.
Interesting correlations of returns with output (GNP) were indeed
found, while correlations with spending turned out to be small and
largely insignificant. (Details are in an earlier version of this
paper, which is available on request.) In reporting empirical results,
I will therefore focus on the output component. This is done formally
by imposing the auxiliary assumption that correlations between
government spending and debt are approximately zero, $\Sigma_{rg} = 0$.
Then, the optimal structure of debt is proportional to the vector
$s = \Sigma_{rr}^{-1}\Sigma_{ry}$, which depends only on the variance-covariance
matrix of innovations in output and security returns, $d_t = w_t s$.

It is useful to keep the factor of proportionality, $w_t$, separate
because $w_t$ depends critically on the discount rate applied to future
output. For a real discount rate of $1 - \psi = 1$ percent per quarter,
$\tau_t = 0.2$, and $\exp(\bar y) = 1$, the proportionality factor is
$w_t = [\exp(\bar y)/(1-\psi)]\tau_t = 20$; but if $1 - \psi = 0.5$ percent were
assumed, the factor would double. Recalling that $\Sigma_{ry}$ is the
covariance between the present value of output and returns, the
discount factor also enters into the vector $s$, but it turns out that
different discount rates have a negligible effect on the estimates. (To
save space, results for $s$ will be shown for $\psi = 0.99$ only.)
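
The sensitivity of the proportionality factor to the discount rate is easy to reproduce. The function below is a hypothetical helper that simply evaluates the expression for $w_t$ given in the text.

```python
import math

def weighting_factor(tau, psi, ybar=0.0):
    """w_t = [exp(ybar) / (1 - psi)] * tau_t."""
    return math.exp(ybar) / (1.0 - psi) * tau

# Discount rates of 1 percent and 0.5 percent per quarter, tax rate of 0.2.
print(round(weighting_factor(0.2, 0.99), 6))
print(round(weighting_factor(0.2, 0.995), 6))
```

Halving the discount rate doubles the factor, which is why the text reports the vector $s$ separately and applies a range of weights afterwards.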

Thus, I will concentrate on computing the vector $s$, which indicates
whether securities enter with positive or negative sign into the
optimal portfolio and in which relative quantities. For the
interpretation, a range of "reasonable" weights, say between 10 and 40,
may be applied to compute $d_t$. The main econometric problem in
estimating $\Sigma_{ry}$ or $s$ is to identify the innovations, in particular
the change in expectations of "far out" realizations of output growth,
which enter $\Sigma_{ry}$ through the expected-present-value expression
$\sum_{j \ge 0}\psi^j \hat y_{t+1+j}$. Vector-autoregression (VAR) techniques were
used because they seem ideally suited for the tasks of extracting the
covariance structure and computing projections of a multivariate
process.
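
To show why a VAR is convenient here, consider a simplified first-order case. If $x_t = (y_t, r_t)'$ follows $x_{t+1} = A x_t + \varepsilon_{t+1}$, the revision in expected output growth $j$ periods ahead is $e_y' A^j \varepsilon_{t+1}$, so the innovation in the present value is $e_y'(I - \psi A)^{-1}\varepsilon_{t+1}$. The sketch below uses a bivariate VAR(1) with given coefficients instead of the four-lag VAR's estimated in the paper; the matrices are hypothetical and the function names are assumptions.

```python
def pv_loading(a, psi):
    """Row vector e_y' (I - psi*A)^{-1} for a bivariate VAR(1) ordered (output growth, return)."""
    m00, m01 = 1.0 - psi * a[0][0], -psi * a[0][1]
    m10, m11 = -psi * a[1][0], 1.0 - psi * a[1][1]
    det = m00 * m11 - m01 * m10
    # First row of the inverse of [[m00, m01], [m10, m11]].
    return [m11 / det, -m01 / det]

def supply_indicator(a, omega, psi):
    """Return (c, s): covariance of the return innovation with the innovation in the
    present value of output growth, and s = c / Var(return innovation)."""
    lam = pv_loading(a, psi)
    c = lam[0] * omega[0][1] + lam[1] * omega[1][1]
    return c, c / omega[1][1]

# Hypothetical VAR(1) coefficients and residual covariance matrix.
A = [[0.3, 0.1], [0.0, 0.2]]
OMEGA = [[1.0, 0.2], [0.2, 0.5]]
print(supply_indicator(A, OMEGA, psi=0.99))
```

With more lags the same formula applies to the companion matrix. The example also shows how a return that predicts future output (a nonzero upper-right coefficient) raises the covariance beyond the contemporaneous residual covariance.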

For all securities, I started with a minimal bivariate VAR including
only GNP growth (for $y_t$) and a single return series, using quarterly
U.S. data from 1954:2 to 1987:4, including a constant and four lags.
Several alternative versions were estimated to see whether the results
are robust to changes in specification.¹³ Finally, I computed optimal
debt structures for several return series simultaneously, with and
without additional information variables. These alternative
specifications will only be displayed when they affect the conclusions
from the basic bivariate process.

Consistent point estimates and asymptotic standard errors of the
elements of $\Sigma_{ry}$ and $s$ (denoted by $c_k$ and $s_k$ in the tables) can
be obtained as functions of the VAR coefficients and the residual
variance-covariance matrix (see Theodore Anderson, 1958; Takeshi
Amemiya, 1985; Peter Schmidt, 1976). If the VAR includes only $y_t$ and a
single return series, the signs of $c_k$ and $s_k$ can also be determined
with a simpler auxiliary regression of $y_t$ on the current return and
lagged values of both series. Details are available from the author
upon request.


B. Domestic Debt Securities 


Currently all U.S. government liabilities 
are nonindexed dollar-denominated debt 
securities with various maturities. To focus 
on indexation first, consider a debt portfolio 
with only two securities, an indexed bond 
(k =0) and a one-period nominal bond (k 
= 1). Since the real return on a one-period 
nominal bond is the known promised yield, 
the innovation in real return is —1 times 
the rate of inflation. Thus, the covariance 
between the present value of GNP and in- 
flation determines whether nominal debt is 
desirable as a hedge. Estimates based on VAR's with GNP growth and
inflation are displayed in Table 2 (where all estimates have been
multiplied by -1; i.e., a positive value means that nominal bonds
should be issued).

¹³The alternative estimates used eight instead of four lags, sample
periods 1954:2-1972:4 and 1973:1-1978:4 (intended to capture potential
breaks in the processes), or one or more additional variables in the
information set (other returns, military spending, money supply M1,
the monetary base, and import prices). All macroeconomic variables are
log-differenced.

TABLE 2: RETURNS ON NOMINAL BONDS

VAR    ρ_k      c_k      std-c_k        s_k     std-s_k
1      0.853    0.511    0.254** ++     3.28    1.59**
2      0.534    0.220    0.102**        1.95    0.80**
3      0.540    0.237    0.096**        2.00    0.77
4      0.461    0.166    0.075**        1.63    0.70**
5      0.388    0.133    0.069*         1.34    0.68**

Legend: ρ_k is the correlation between a return series and the present
value of output; c_k is the covariance between a return series and the
present value of output; std-c_k is the asymptotic standard error of
c_k; s_k is the indicator of optimal supply of a security, as defined
in Section IV-A; and std-s_k is the asymptotic standard error of s_k.

Notes: The columns of c_k and std-c_k have been multiplied by $10^4$ to
improve readability. Stars indicate rejections of c_k = 0 or s_k = 0 at
the 10-percent (*), 5-percent (**), or 1-percent (***) significance
level based on the asymptotic standard errors (Wald tests). Plus signs
(+, ++, or +++) indicate rejection of the same hypotheses at the same,
respective, significance levels, based on a regression of GNP growth on
current returns and four lags of both series. The VAR specifications
are:

1) bivariate with GNP growth and inflation, sample 1954:2-1987:4, four
   lags;
2) with money supply M1, money base, and DTB as information variables;
   otherwise as for VAR 1;
3) with M1, money base, military spending, and DTB as information
   variables; otherwise as for VAR 1;
4) with M1, money base, import prices, and DTB as information
   variables; otherwise as for VAR 1;
5) with M1, money base, import prices, military spending, and DTB as
   information variables; otherwise as for VAR 1.

In the basic specification, VAR 1, the correlation between the
innovations is 0.85. This seems extraordinarily high and may reflect
the omission of other variables that predict output and inflation; but
even if various other variables are included (see VAR's 2-5), the
correlations remain around 0.50. The estimates are not only
statistically but also economically significant. Taking $s_k = 1.34$ (the
lowest estimate) and a proportionality factor of $w_t = 20$, the ratio of
nominal debt (with one-quarter maturity) to GNP should be about
$d_t = 26.8$, as opposed to the current debt:GNP ratio of around 0.50. One
has to keep in mind, though, that time-consistency issues are not
modeled here, which may reduce the optimal amount of nominal debt.
Still, the results suggest that the optimal solution may require the
government to hold indexed bonds and to issue nominal debt in an amount
far exceeding its total debt. If such simultaneous large long and short
positions are impractical, the current policy of issuing only nominal
bonds may be interpreted as a corner solution. Alternatively, it may be
that the three-month maturity of nominal bonds implicit in quarterly
data is too low.
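
The implied debt ratios can be tabulated for the range of "reasonable" weights mentioned in Section IV-A. The snippet below only multiplies the $s_k$ estimates reported in Table 2 by candidate values of $w_t$; the function name is an assumption of this sketch.

```python
# Indicators of optimal supply s_k for nominal bonds, from Table 2 (VAR's 1-5).
s_estimates = [3.28, 1.95, 2.00, 1.63, 1.34]

def implied_debt_ratio(s, w):
    """Optimal ratio of security debt to output, d = w * s."""
    return w * s

for w in (10, 20, 40):
    ratios = [round(implied_debt_ratio(s, w), 1) for s in s_estimates]
    print(f"w = {w}: {ratios}")
```

Even the smallest estimate combined with the smallest weight gives a ratio far above the actual debt:GNP ratio of around 0.50, which is the sense in which the text calls the estimates economically significant.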

In analyzing the optimal maturity distribution,
I will limit the study to the choice
between one-period, two-period, and long- 
term bonds, represented by three-month 
Treasury bills, six-month Treasury bills, and 
the longest-term Treasury bonds. Because 
of the high correlation between interest 
rates, more variables would probably not 
add any new insights. 

To obtain implications for observable (meaning nominal) return series
and to prevent a repetition of the indexation question, it is
convenient to restate the term structure in terms of three-month
Treasury bills and forward contracts. Given that nominal debt should be
issued (as determined above), the question is only whether some of it
should have maturities longer than three months. The real return
innovation of the long-term bond relative to the three-month bill is
given by its nominal return, LRET. The real return on a six-month,
relative to a three-month, Treasury bill is the six-month yield at the
beginning of the period minus the three-month yield at the end of the
period. Since six-month yields are not available for the entire sample,
the return innovation is proxied by the change in three-month yields,
denoted by DTB. (For 1960-87, results were similar to those with the
exact return series and are therefore not reported.)

TABLE 3: MATURITY CHOICE

VAR    ρ_k      c_k      std-c_k        s_k      std-s_k
A. Changes in Treasury-bill rates:
1      0.084    0.268    0.590          0.38     0.83
2      0.568    1.553    0.747** ++     2.41     [illegible]
B. Returns on long-term bonds:
1      0.155    0.70     1.507          0.039    0.060
2      0.334    1.796    1.397          0.074    0.057

Legend: See Table 2 for notation.

Notes: VAR's 1 are bivariate processes with GNP growth and a return
series (DTB or LRET), using sample data for 1954:2-1987:4, including
four lags. VAR's 2 use eight lags instead. The columns of c_k and
std-c_k have been multiplied by a power of 10 in Part A (exponent
illegible in the source) and by $10^4$ in Part B to improve readability.

Innovations in LRET and DTB can be interpreted as return innovations on
a three-month forward contract on the security; that is, correlations
with the present value of output determine whether the government has a
hedging demand for forward contracts. An optimal short position
together with the supply of three-month Treasury bills would establish
the optimality of issuing longer-term bonds.

Estimates are displayed in Table 3. All correlations between the
present value of output with DTB and LRET have positive signs, which
indicates that two-period or long-term nominal bonds (or equivalently,
forward contracts) should be issued. Unfortunately, only the estimate
for DTB based on an eight-lag VAR is significant. All the positive
correlations seem to be due to a delayed reaction of output to interest
rates, which comes out strongest in processes with long lags. Similar
results were obtained when the optimal supply of three-month,
six-month, and long-term bonds (or two of the three) was estimated
jointly as a vector; only the estimate for nominal debt (vs.
indexation) was significant. Overall, the point estimates provide some
support for issuing long-term debt, but because of the large standard
errors, a variety of maturity distributions could be consistent with
optimal policy.


C. Nontraditional Government Liabilities 


There is no reason why governments 
should restrict their liabilities to nominal or 
indexed debt securities. In this section, I 
will consider two other classes of securities: 
stocks and foreign-currency debt. 

German mark- and Swiss franc-denom- 
inated “Carter-bonds” were issued by the 
U.S. government in 1978. More recently, 
yen-denominated debt has been proposed. 
Concentrating on marks and yen, I consider 
one-period, two-period, and long-term for- 
eign-currency bonds. Their real returns are 
linear combinations of nominal exchange- 
rate changes, nominal interest-rate changes, 
and domestic inflation. As in the domestic 
context, it is instructive to analyze the com- 
ponents, which may be interpreted as for- 
ward contracts. 

Results are displayed in Table 4, all for the period of flexible
exchange rates 1973:1-1987:4. The return series are the rates of dollar
depreciation relative to marks and yen (EG and EJ, respectively) and
-1 times the change in short- and long-term German and Japanese
interest rates (SG and LG for Germany, SJ and LJ for Japan,
respectively).

TABLE 4: OPTIMAL PORTFOLIOS WITH FOREIGN-CURRENCY-DENOMINATED SECURITIES

RET    ρ_k       c_k       std-c_k     s_k       std-s_k
A. One-period U.S. dollar-bonds and one- and two-period mark bonds:
P      0.796     0.387     0.285       3.264     [illegible]
EG     0.324     2.353     2.154       0.079     0.064
SG     0.251     0.076     0.075       2.606     1.587
B. One-period U.S. dollar and mark bonds and long-term mark bonds:
P      0.361     0.154     0.250       0.626     1.927
EG     0.340     2.286     1.844       0.133     0.960**
LG     0.544     0.081     0.051       7.258     3.215**
C. One-period U.S. dollar-bonds and one- and two-period yen bonds:
P      0.699     0.291     0.184       2.333     1.515
EJ    -0.191    -1.312     2.094      -0.036     0.063
SJ     0.795     0.188     0.075**     4.657     1.629***
D. One-period U.S. dollar and yen bonds and long-term yen bonds:
P      0.540     0.226     0.177       1.97      0.364
EJ     0.057     0.355     1.896       0.070     0.076
LJ     0.598     0.091     0.048*      7.528     3.526**

Legend: See Table 2 for notation.

Notes: Each panel is estimated with a four-variable VAR with GNP and
the three return series listed under RET, using sample data from
1973:1-1987:4, including four lags. The columns of c_k and std-c_k have
been multiplied by $10^4$ to improve readability.


Since the question of whether there is an incremental benefit in
issuing German or Japanese currency bonds arises in the presence of
domestic nominal bonds, I concentrate on the multivariate framework. In
Part A of Table 4, nominal domestic bonds (with real returns contingent
on $-1 \times$ inflation, P) are considered jointly with one- and
two-period German bonds. The quarterly exchange-rate movements, EG,
indicate the return on three-month investments in Germany relative to
Treasury bills. The change in German interest rates, SG, proxies the
return on six-month relative to three-month investments. The positive
correlations and point estimates suggest that both variables may have
hedging roles, but they are insignificant.

Significant positive results were obtained
by the analogous regressions with long-term
German bonds in Part B of Table 4 and
with short- and long-term Japanese bonds
in Parts C and D. (Bivariate VAR's with
GNP growth and a single return series were
similar: estimates for LG, SJ, and LJ are
significant.) Interestingly, the optimal exposure
to yield changes exceeds the optimal
total exposure to exchange-rate risk in all
cases.¹⁴ The optimal hedge would have to
combine short positions in short-term foreign
securities with larger holdings of
longer-term bonds. Overall, exposure to selected
foreign interest rates appears to be
desirable. Though a more comprehensive
analysis of foreign-currency debt is beyond
the scope of this article, there seems to be
some potential for improvements in United
States debt policy in this direction.

Finally, stock prices are commonly con- 
sidered to be highly cyclical variables, which 
makes them natural candidates for hedging 
output risk. Given that nominal debt is pre- 


14Exposure to exchange-rate risk may be desirable
for reasons not considered here (e.g., for providing
incentives or credibility in the context of exchange-rate
stabilization).

TABLE 5—OPTIMAL PORTFOLIOS OF NOMINAL BONDS AND STOCKS

RET       ρ_k       c_k       std-c_k      s_k       std-s_k

A. Nominal bonds and stocks:
P         0.847     0.491     0.253**      2.932     1.652**
Stocks    0.451     5.073     2.023       -0.050     0.034

B. Nominal bonds and stocks (five-variable VAR):
P         0.660     0.334     0.150**      2.031     1.035**
Stocks    0.520     5.198     1.876***     0.074     0.033**

Legend: See Table 2 for notation.

Notes: Part A is estimated with a three-variable VAR with GNP, inflation, and stock
returns, using sample data from 1954:2-1987:4, including four lags. The VAR for Part
B includes M1 and the monetary base as additional variables. The columns of c_k and
std-c_k have been multiplied by 10^4 to improve readability.


sent, optimal nominal debt and the optimal 
stock market position were estimated jointly 
in Table 5. Part A is based on a VAR with 
output growth, inflation, and stock returns 
(measured by the S&P 500 index). Part B is 
based on a VAR that includes the M1 money 
supply and the monetary base as additional 
variables. In both cases, the optimal debt 
structure includes a short position in stocks, 
which is significant in the larger process that 
is based on the better estimate of inflation. 
(Similar estimates were obtained in a bivari- 
ate VAR with GNP growth and stock re- 
turns only, which were significant at the 
1-percent level.) 

To my knowledge, a proposal suggesting
government participation in the stock market
has not been made before. Based on the
statistical evidence linking stocks to the present
value of output, a short position would
provide a hedge against cyclical shortfalls in
government revenue. However, when exploring
such nonstandard financing strategies,
one may ask why the government
should not go one step further and sell
synthetic securities that are directly contingent
on GNP. A theory of market incompleteness
(beyond the scope of this paper;
see Gale, 1990) would be needed to decide
whether such securities could be issued.
However, the empirical evidence suggests
that innovative financing strategies, with existing
or synthetic securities, are worth exploring.


V. Conclusions 


The optimal structure of government debt
has been analyzed in a stochastic environment.
In a setting with distortionary taxes,
the government should smooth tax rates over
states of nature as well as over time. This
requires state-contingent government liabilities
that provide a hedge against shocks to
the budget.

For postwar U.S. data, tax smoothing as a
positive theory of policy cannot be rejected
on the basis of the time path of taxes.
However, a number of security returns are
correlated with tax rates, leading to a rejection
on that basis. Estimates of optimal debt
portfolios provide strong support for using
nominal, nonindexed, government debt, but
provide only weak evidence on the maturity
distribution. Moreover, it seems that the
government could improve tax smoothing by
having some nontraditional liabilities, like
foreign-currency debt or a short position in
the stock market.


REFERENCES 


Allen, Franklin and Gale, Douglas, “Optimal
Security Design,” Review of Financial
Studies, Fall 1988, 1, 229-63.

Amemiya, Takeshi, Advanced Econometrics,
Cambridge, MA: Harvard University
Press, 1985.

Anderson, Theodore W., An Introduction to
Multivariate Statistical Analysis, New York:
Wiley, 1958.

Barro, Robert J., “On the Determination of
Public Debt,” Journal of Political Economy,
October 1979, 87, 940-71.

______, “On the Predictability of Tax-Rate
Changes,” manuscript, University of
Rochester and NBER, October 1981.

Blanchard, Olivier and Fischer, Stanley, Lectures
on Macroeconomics, Cambridge, MA:
MIT Press, 1989.

Bohn, Henning, “Why Do We Have Nominal
Government Debt?” Journal of Monetary
Economics, January 1988, 21, 127-40.

Calvo, Guillermo, “On the Time Consistency
of Optimal Policy in a Monetary Economy,”
Econometrica, November 1978, 46,
1411-28.

Fischer, Stanley, “Welfare Effects of Government
Issue of Indexed Bonds,” in Rudiger
Dornbusch and Mario Simonsen, eds.,
Inflation, Debt, and Indexation, Cambridge,
MA: MIT Press, 1983.

Gale, Douglas, “The Efficient Design of Public
Debt,” in R. Dornbusch and M.
Draghi, eds., Capital Markets and Debt
Management, Cambridge: Cambridge
University Press, forthcoming 1990.

Hansen, Lars and Singleton, Kenneth, “Stochastic
Consumption, Risk Aversion, and
the Temporal Behavior of Asset Returns,”
Journal of Political Economy, April 1983,
91, 249-65.

Judd, Kenneth, “Optimal Taxation in Dynamic
Stochastic Economies: Theory and
Evidence,” manuscript, Hoover Institution,
Stanford University, 1989.

Kydland, Finn and Prescott, Edward, “Rules
Rather Than Discretion: The Inconsistency
of Optimal Plans,” Journal of Political
Economy, June 1977, 85, 473-91.

Lucas, Robert E., “Asset Prices in an Exchange
Economy,” Econometrica, November
1978, 46, 1429-45.

______, and Stokey, Nancy, “Optimal Fiscal
and Monetary Policy in an Economy without
Capital,” Journal of Monetary Economics,
July 1983, 12, 55-93.

Mehra, Rajnish and Prescott, Edward, “The
Equity Premium: A Puzzle,” Journal of
Monetary Economics, March 1985, 15,
145-62.

Modigliani, Franco and Sutch, Richard, “Innovations
in Interest Rate Policy,”
American Economic Review, May 1966,
56, 178-97.

Sahasakul, Chaipat, “The U.S. Evidence of
Optimal Taxation Over Time,” Journal of
Monetary Economics, November 1986, 18,
251-75.

Schmidt, Peter, Econometrics, New York:
Marcel Dekker, 1976.

Tresch, Richard, Public Finance: A Normative
Theory, Plano, TX: Business Publications,
1981.




Beneficial Concentration 


By ANDREW F. DAUGHETY*


Concentration is held by many economists
(and undoubtedly by a large proportion of
the population of noneconomists) to be so
worrisome as to be a cause for concern or
even action. This concern is based on an
intuition derived mainly from considering
the symmetric equilibria of models of multifirm
industry behavior. Typically, as the
number of firms in the symmetric equilibrium
increases, some measure of welfare
rises (e.g., surplus), and some measure of
concentration falls (e.g., the Herfindahl
index). However, there are two general
“causes” of concentration in industries (too
few firms and significant disparities in firm
sizes), and the foregoing intuition does not
carry over to the asymmetric equilibria that
often lurk about in well-formulated models
and which seemingly reflect a more realistic
picture of the world. In this paper, I show
that social optimality may involve extensive
asymmetry: uniformity of firm size may be
cause for concern.

This result leads to a number of implications.
For example, concentration measures
(such as the Herfindahl index) provide little
insight about welfare, mainly because (as
will be shown) increases in such measures
will sometimes reflect increases in welfare
and sometimes reflect decreases in welfare.
A related implication will hold for average
firm profits and concentration measures. In
fact, as will be seen below, the often observed
empirical result of a positive correlation
between various measures of concentration
and firm profits may indicate too


*Departments of Economics and Management Sciences,
College of Business Administration, The University
of Iowa. This work was supported by NSF grant
thank George Neumann, Jennifer Reinganum, Mike
Scherer, Gene Savin, Cliff Winston, and an anonymous
referee for their suggestions and comments on earlier
versions of this paper.

little asymmetry rather than too much. Finally,
a merger that reduces the number of
firms can be welfare-enhancing even if there 
are no cost advantages to the merger itself. 

The analysis proceeds from a model that, 
in a very simple manner, varies the degree 
of asymmetry of the organization of the 
industry. Explaining why one asymmetric so- 
lution might be more reasonable than an- 
other is not of concern here. Rather, I will 
simply assume that such equilibria arise be- 
cause it is individually (and possibly mutu- 
ally) advantageous to the firms in the indus- 
try for this to occur. This paper focuses on 
the social consequences of such outcomes 
and the implications for policies associated 
with asymmetry (such as antitrust). 


I. An Asymmetric Industry Comprised of 
Identical Firms 


Consider an industry comprised entirely 
of identical firms. The reason for doing this 
is twofold. First, by eliminating interfirm 
differences, one can focus on attributes of 
the equilibrium, rather than characteristics 
of the technology or markets involved. Sec- 
ond, one eliminates any traditional social 
advantages that might accrue to asymmetric 
access to resource markets, technology, sit- 
ing, or other factors. 

Usually, when an industry comprised of 
identical firms is considered, one examines 
(i.e., restricts attention to) symmetric equilibria.
This seemingly natural assumption
ignores the fact that there are conditions 
under which asymmetric behavior can be 
individually and mutually beneficial to firms 
that are otherwise identical. Thus, for ex- 
ample, if one firm develops an R&D facility 
or develops a sophisticated marketing group, 
it is not always advantageous for the other 
firms in the industry to do likewise. Much 
depends upon the behavior of the potential 
leader, for by readily licensing new products,
or by focusing the marketing campaigns
on interindustry rather than intraindustry
competition, the residual firms may
choose to follow rather than also to invest
in such types of capital. This occurs because
firms in the real world have the flexibility to
write the rules of the game that they wish to
play.

Such asymmetry can be readily justified 
theoretically. For example, simply by allow- 
ing for production over time, various asym- 
metric equilibria may arise. Garth Saloner 
(1987) demonstrates this by allowing for two 
production periods before the market clears 
in the standard Cournot duopoly analysis 
with identical firms. He shows that all out- 
put combinations on the outer envelope of 
the best response functions from one Stack- 
elberg solution to the other are sustainable 
as subgame perfect Nash equilibria. In this 
paper, I take as given that firms may find 
asymmetry advantageous and proceed to 
employ a model of identical firms that al- 
lows, in a very simple manner, for both 
symmetric and asymmetric equilibria. 
Specifically, a simple n-firm, static, quantity- 
setting oligopoly model, with the number of 
leaders as a parameter, will be posed and 
examined. Welfare (in the case at hand, 
measured by aggregate output), average 
profits, and concentration (measured by the 
Herfindahl index) can thus all be expressed 
as functions of the number of firms and the 
number of leaders, allowing for compar- 
isons of interest. 

To formalize the above, I first specify a
static model of an industry comprised of
$n$ identical firms, each producing a homogeneous
product at a constant marginal
cost of $c$; firm $i$'s output is denoted $x_i$. For
convenience, let the inverse demand function
which specifies price as a function of
aggregate output be $p = a - b\sum_j x_j$, where
$a > 0$, $b > 0$. Firm $i$'s profit is
$\Pi^i(x) = (a - c - b\sum_j x_j)x_i$, where
$x = [x_1, \ldots, x_n]'$ is the vector of firm outputs.

A cautionary note is called for at this 
point. The foregoing setup is clearly quite 
special, especially with respect to the cost 
structure, since average costs are constant. 
It is straightforward, but notationally and 
algebraically unpleasant, to allow for declining
average costs by (say) incorporating a
fixed cost in the cost function. If such costs
are large, then the adjustments to industry 
structure that will be examined might in- 
duce exit, and thus the results would change. 
Technically, this is due to discontinuities in 
the optimal response functions for the firms 
(see Avinash Dixit [1979] for a discussion of 
this point). For moderate-to-small costs, no 
results will change. Since the main issue is 
to examine how industry structure can affect 
standard notions of the relationship be- 
tween concentration and welfare, the above 
model (even though it is not particularly 
realistic) is sufficient. Moreover, since em- 
ploying constant average costs is likely to 
underestimate (vis-a-vis decreasing costs) 
the welfare benefits of asymmetry, the anal- 
ysis to follow errs, if at all, by being too 
conservative. 

Since the main interest of this paper con- 
cerns industries of moderate size, I will not 
consider either collusive or perfectly com- 
petitive equilibria. Traditionally, two nonco- 
operative oligopoly solutions have been em- 
ployed to predict equilibrium output levels, 
namely Cournot and Stackelberg (see James 
W. Friedman [1977] and Daughety [1988] 
for detailed discussions of these solutions). 
An equilibrium is a Cournot (Nash) equilib- 
rium if: 1) each firm chooses its output to be 
a best response (i.e., profit-maximizing re- 
sponse) to conjectured outputs of all other 
firms; and 2) the conjectures are correct for 
all firms. In the case of identical firms, the 
Cournot equilibrium is symmetric: all firms 
produce the same level of output and re- 
ceive the same profit. 

In the Stackelberg equilibrium, one firm 
is a “leader” and n —1 firms are followers. 
The followers “play Cournot” by computing 
best-response output levels to the aggregate 
of all others’ levels. The leader recognizes 
this and uses the followers’ best-response 
functions to decide on a profit-maximizing 
output level. Thus, the leader’s output level 
is not a best response to the followers’ out- 
put levels; it is instead a best response to 
the followers’ best-response output functions. 

I will now extend this to allow for $m$
leaders and $n - m$ followers (see Daughety,
1984; Hanif D. Sherali, 1984). Let the $n - m$
followers “play Cournot,” taking the aggregate
of all other followers' output and the
$m$ leaders' output as given. Moreover, let
the leaders recognize this (i.e., use the followers'
best-response functions), but let each
leader play Cournot against each other
leader, realizing that all leaders understand
this and are also choosing output in a similar
manner. As will be seen below, this
parameterization provides for a wide range
of potential market structures.

Let $x^L$ denote a typical leader's output,
$X^L$ denote aggregate leader output, $x^F$ denote
a typical follower's output, $X^F$ denote
aggregate follower output, and $X^T$ denote
aggregate industry output. Then, the $m$-leader
equilibrium solutions are as follows:

$$x^L(m,n) = \frac{(a-c)/b}{m+1}$$

$$X^L(m,n) = \frac{m\,[(a-c)/b]}{m+1}$$

$$x^F(m,n) = \frac{1}{n-m+1}\,x^L = \frac{(a-c)/b}{(m+1)(n-m+1)}$$

$$X^F(m,n) = \frac{(n-m)\,[(a-c)/b]}{(m+1)(n-m+1)}$$

$$X^T(m,n) = \frac{n+nm-m^2}{n-m+1} \times \frac{(a-c)/b}{m+1}.$$


Note that both $m = 0$ and $m = n$ correspond
to a Cournot industry: when $m = 0$,
all firms are followers and play Cournot;
when $m = n$, all firms are leaders and play
Cournot. Clearly, $m = 1$ corresponds to the
standard Stackelberg model. It is tedious
but straightforward to show the following
(proofs are available from the author upon
request).¹
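The equilibrium expressions above can be evaluated directly. The following is a minimal sketch, not part of the original analysis: the function name `equilibrium` and the default parameter values (a = 10, b = 1, c = 2) are illustrative assumptions, and exact rational arithmetic is used only so that the equalities can be checked without rounding.

```python
from fractions import Fraction

def equilibrium(m, n, a=10, b=1, c=2):
    """Equilibrium outputs with m leaders among n firms (0 <= m <= n).

    a, b, c are the demand intercept, demand slope, and marginal cost.
    The integer defaults are arbitrary and purely illustrative.
    """
    k = Fraction(a - c, b)             # (a - c)/b
    x_lead = k / (m + 1)               # typical leader output x^L
    x_foll = x_lead / (n - m + 1)      # typical follower output x^F
    X_lead = m * x_lead                # aggregate leader output X^L
    X_foll = (n - m) * x_foll          # aggregate follower output X^F
    return {"x_L": x_lead, "x_F": x_foll,
            "X_L": X_lead, "X_F": X_foll, "X_T": X_lead + X_foll}

if __name__ == "__main__":
    n = 5
    for m in range(n + 1):
        print(m, equilibrium(m, n)["X_T"])
```

With these assumed values, the aggregate output for m = 0 and m = n coincides with the familiar n-firm Cournot total, n(a − c)/[b(n + 1)], and the one-leader case produces more than either. For m = 0 the entry `x_L` is only a notional value, since no firm actually leads.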


1For expository convenience, m and n are being
manipulated as if they were continuous variables. This
is valid as a procedure, as long as appropriate rounding
is employed as necessary. This holds since all the
functions are either monotonic or unimodal, and thus
(in one dimension) integer solutions can be directly
constructed from the continuous solution.

(1) For fixed $n$, $X^T$ is concave in $m$
($\partial^2 X^T/\partial m^2 < 0$).

(2) For fixed $n$, average firm profit, $\Pi(m,n)$
($= \sum_i \Pi^i / n$), is convex in $m$.


Both results are intuitively reasonable. $X^T$
concave in $m$ extends the standard result
that the Stackelberg aggregate output is
greater than the Cournot aggregate output.
Since $X^T(0,n) = X^T(n,n)$ and $X^T(1,n) >
X^T(0,n)$, one would expect $X^T$ to rise and
then fall. This would also suggest that
$\Pi(m,n)$ would first fall and then rise. These
two results figure significantly in what follows.


A. Welfare, Concentration, and Asymmetry 


Under the assumptions used above, increases
in aggregate output, $X^T$, result in
increases in welfare (measured by surplus),
and decreases in aggregate output result in
decreases in welfare. Thus, welfare attains
an interior maximum when $X^T$ does, which
is when the optimal² number of leaders,
$m^*$, equals $n/2$.

Most importantly, welfare is maximized
when there is considerable asymmetry: symmetric
equilibria ($m = 0$ or $m = n$) are
welfare-minimizing (for fixed $n$). In fact, for the
type of asymmetry I am considering, any
degree of asymmetry is socially preferable
to symmetry. Again, this follows from the
fact that $X^T$ is concave in $m$ and is symmetric
about $m^*$.
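The claims about the welfare-maximizing number of leaders can be checked numerically. The sketch below is an illustration only: the helper names `total_output` and `welfare_maximizers` are hypothetical, and output is measured in units of (a − c)/b so that no demand or cost parameters need to be chosen.

```python
from fractions import Fraction

def total_output(m, n):
    """Aggregate output X^T(m, n), measured in units of (a - c)/b."""
    return Fraction(n + n * m - m * m, (m + 1) * (n - m + 1))

def welfare_maximizers(n):
    """Integer values of m in 0..n at which X^T(m, n) is largest."""
    values = {m: total_output(m, n) for m in range(n + 1)}
    best = max(values.values())
    return [m for m, v in values.items() if v == best]

if __name__ == "__main__":
    print(welfare_maximizers(10))
    print(welfare_maximizers(11))
```

For an even number of firms the maximizer is n/2, and for an odd number the two integers adjacent to n/2 tie, which matches the rounding convention for m* described in the text.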

One immediate further implication is that 
measures of concentration may have little, if 
anything, to do with indicating welfare. This 
is particularly easy to see with the Herfindahl
index. The Herfindahl index is the sum
of the squares of market shares of firms in 
the industry. George J. Stigler (1964) pro- 
vides a theoretical basis for its use as a 
measure of concentration. To examine this 


2More precisely, $m^* = n/2$ when $n$ is even. When $n$
is odd, $m^*$ is either $(n-1)/2$ or $(n+1)/2$. For example,
if $n = 10$, then $m^* = 5$, while if $n = 11$, then $m^* = 5$
or 6. Also, the use of $X^T$ to measure welfare assumes
that there are no ex ante differences between the
equilibria for, say, $m'$ and $m'' > m'$.

index in the setting herein, let $S_L$ denote
the aggregate output share of leader firms.
Then, the Herfindahl index, $H$, is

$$H = \begin{cases} S_L^2/m + (1 - S_L)^2/(n-m) & \text{for } 1 \le m \le n-1 \\ 1/n & \text{for } m = 0 \text{ or } n. \end{cases}$$

Again, treating $m$ as a continuous variable,
it can be shown that

$$\partial H/\partial m \begin{cases} > 0 & m < 1 \\ < 0 & m > 1. \end{cases}$$


Thus, $H$ rises from $m = 0$ to $m = 1$ and
then declines as $m$ grows. While the algebra
for this result is very messy, the intuition is
reasonably straightforward. Clearly $H$ is
equal for $m = 0$ and for $m = n$, since these
values produce symmetric industries; moreover,
one would therefore expect $H$ to rise
and then fall as $m$ ranges from zero to $n$,
since symmetry minimizes $H$ (for fixed $n$).
In the situation with precisely one leader,
there is one firm that has considerably
greater share than any other firm in the
industry. Additional leaders means that the
aggregate leaders' share rises, but each individual
leader's share falls. Therefore,
when $m \ge 2$, there is no one firm with
greater share than any other firm. Thus, one
would expect $H$ to peak at $m = 1$, which it
does.
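The shape of the Herfindahl index across market structures can be tabulated directly from the definition. This is a minimal sketch under the equilibrium shares implied by the model; the function name `herfindahl` is an illustrative assumption, and the leaders' aggregate share is computed as S_L = X^L/X^T = m(n − m + 1)/(n + nm − m²).

```python
from fractions import Fraction

def herfindahl(m, n):
    """Herfindahl index with m leaders among n firms."""
    if m == 0 or m == n:
        return Fraction(1, n)          # symmetric industry
    # Leaders' aggregate output share S_L = X^L / X^T.
    s_lead = Fraction(m * (n - m + 1), n + n * m - m * m)
    return s_lead ** 2 / m + (1 - s_lead) ** 2 / (n - m)

if __name__ == "__main__":
    n = 10
    for m in range(n + 1):
        print(m, float(herfindahl(m, n)))
```

For n = 10 the index jumps from 1/10 at m = 0 to 109/361 (about 0.30) at m = 1 and then falls steadily back to 1/10 at m = n.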

In general, therefore, H is declining both 
over ranges of m when welfare is increasing 
and over ranges of m when welfare is de- 
creasing: decreasing concentration does not 
imply increasing welfare. A similar result 
obtains if a concentration ratio measure is 
used. For example, if the four-firm ratio is 
used, then this ratio increases as m in- 
creases up to m = 4, and then it falls mono- 
tonically as m continues to increase. This 
suggests that, in and of themselves, such 
measures do not reveal what one would 
most like to know: when competition has 
increased and welfare has improved.³


3This is not to say that one could not construct an
index that would reflect welfare considerations. Such

B. Profits and Concentration


In study after study, a positive correlation 
between average firm profit and various 
measures of concentration has been found 
(see Leonard W. Weiss, 1974; Frederick M. 
Scherer, 1980). What this correlation means 
is less obvious (e.g., Sam Peltzman, 1977; 
Scherer, 1980). The foregoing analysis indicates
that profits and concentration measures
are positively correlated when the
number of leaders is less than $m^*$, the socially
optimal number. To see this, recall
that $\Pi(m,n)$ is average firm profit. This can
be readily shown to be

$$\Pi(m,n) = \frac{(a-c)^2\,(n+nm-m^2)}{b\,n\,(m+1)^2\,(n-m+1)^2}.$$

As indicated earlier, $\Pi(m,n)$ is convex in
$m$, and its minimum occurs at $m^* = n/2$.
Therefore, for $m^* > 1$, $\Pi(m,n)$ declines
while $H$ declines for $m < m^*$, and $\Pi(m,n)$
rises as $H$ declines for $m > m^*$. In other
words, for typical industries ($n > 2$), there is
a positive correlation between average firm
profits and concentration when there are
too few leaders and a negative correlation
when there are too many. Figure 1 illustrates
the relationships among $X^T$, $\Pi$,
and $H$.
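The sign pattern described here can be illustrated by stepping m upward and comparing the direction of change in average profit with the direction of change in H. The sketch below is an illustration under the model's formulas, with profit measured in units of (a − c)²/b; the helper names `average_profit`, `herfindahl`, and `comovement` are hypothetical.

```python
from fractions import Fraction

def n_term(m, n):
    """The recurring expression n + nm - m^2."""
    return n + n * m - m * m

def average_profit(m, n):
    """Average firm profit, in units of (a - c)^2 / b."""
    N = n_term(m, n)
    # (m + 1)(n - m + 1) equals N + 1, so the denominator is n (N + 1)^2.
    return Fraction(N, n * (N + 1) ** 2)

def herfindahl(m, n):
    """Herfindahl index with m leaders among n firms."""
    if m in (0, n):
        return Fraction(1, n)
    s_lead = Fraction(m * (n - m + 1), n_term(m, n))
    return s_lead ** 2 / m + (1 - s_lead) ** 2 / (n - m)

def comovement(n):
    """Sign of (change in profit) * (change in H) for each step m -> m + 1, m >= 1."""
    signs = {}
    for m in range(1, n):
        d_profit = average_profit(m + 1, n) - average_profit(m, n)
        d_h = herfindahl(m + 1, n) - herfindahl(m, n)
        product = d_profit * d_h
        signs[m] = (product > 0) - (product < 0)
    return signs

if __name__ == "__main__":
    print(comovement(10))
```

For n = 10 the steps starting at m = 1 through 4 give +1 (profit and concentration move together) and the steps starting at m = 5 through 9 give −1, consistent with a switch in sign at m* = n/2.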

Thus, the empirical observation that there 
is a robust positive correlation between av- 
erage firm profits and industry concentra- 
tion may simply mean that the ratio of lead- 
ers to firms in the typical industry in the 
sample is low (i.e., not socially optimal), 
since actual m appears to be to the left 
of m*. 


C. Mergers and Welfare 


Up to this point, n has been held fixed. 
While large changes in n (e.g., n becoming 
infinite) have the usual results, it is small 


an index, however, cannot rely upon information on
market share alone, as is evident from the foregoing
material and from the discussion of the industry performance
gradient index in Robert E. Dansby and
Robert D. Willig (1979).

changes in $n$ that are now of interest. For
example, what if two firms merge? As will
be shown below, this need not mean that
welfare has fallen. As pointed out earlier,
there is no cost advantage to a merger here.
Thus, any advantages or disadvantages are
purely due to the noncooperative equilibrium
itself.

There are several recent papers that have
examined the strategic effects of mergers.
Stephen W. Salant et al. (1983) examine a
quantity-setting oligopoly and look at the
conditions under which merger may be disadvantageous
to the participants. This can
occur since, when the merging firms restrict
output, the residual firms in the industry
can expand, with the net result being that
the merger is disadvantageous for the participants.
Raymond Deneckere and Carl
Davidson (1985) demonstrate how this result
can be sensitive to the assumption of
strategic variable employed, by employing a
model in prices and getting opposite results.
Morton Kamien and Israel Zang (1987) employ
a model of a suboptimizing holding
company in a quantity-choosing context that
reestablishes a variety of conditions wherein
merger can be advantageous. Most recently,
Joseph Farrell and Carl Shapiro (1990) use
a Cournot model to analyze mergers and
also observe that mergers can be welfare-enhancing.
They find that if the merger
generates “synergies” (e.g., if the participants
in the merger can recombine their
assets so as to increase their joint production
possibilities), then welfare can be enhanced
by a merger. As will be seen below,
welfare can be enhanced by a merger in
spite of the lack of such synergies, if the
merger alters the behavior of the participants.

In this section, I wish to examine a special
type of merger (or dismemberment)
wherein, say, two followers merge and the
result is a firm that behaviorally is a leader.
Figure 2 illustrates $X^T(m,n)$ for industries
with $n$ and $n+1$ (say, 12 and 13)
firms. Recalling that equilibrium total output
for an $n$-firm industry with $m$ leaders is

$$X^T(m,n) = \frac{(a-c)(n+nm-m^2)}{(m+1)(n-m+1)\,b},$$

it is straightforward to show that

$$\partial X^T/\partial m \begin{cases} < 0 & n < 2m \\ = 0 & n = 2m \\ > 0 & n > 2m \end{cases}$$

$$\partial X^T/\partial n > 0$$

$$\partial^2 X^T/\partial m\,\partial n \begin{cases} < 0 & 3m+1 < n \\ = 0 & 3m+1 = n \\ > 0 & 3m+1 > n. \end{cases}$$

As indicated in the figure, changes in $n$
alone have the usual interpretation. Thus, 
for example, a bankruptcy accompanied by 
no entry and no change in industry struc- 
ture is welfare-impairing. The cross-partial, 
however, is most interesting: could changes 
in industry structure result in a welfare im- 
provement? What if a merger reduced the 
number of firms but increased m? Clearly, 
this is a very special type of merger, such as 
was briefly discussed above. 

More precisely, when is $X^T(m+1, n-1)$
greater than $X^T(m,n)$? A little algebraic
manipulation reveals that this occurs when
$3(m+1) < n$. Thus, leader-generating mergers
in industries that are very close to symmetric
(and low in terms of the number of
leaders) can be socially desirable. The increase
in asymmetry again results in an increase
in competition, resulting in greater
aggregate output and greater welfare. Of
course, as the derivative with respect to $n$
shows, mergers that simply reduce the number
of followers without increasing the number
of leaders reduce welfare.

Of course, such mergers should not occur
unless the new entity is more profitable
than the individual parts. Thus, when is
$\Pi^L(m+1, n-1) > 2\Pi^F(m,n)$ (i.e., when
does the postmerger profit of the generated
leader exceed the premerger profits of the
two followers)? This turns out to occur if
and only if $f(m,n) = (n-m+1)^2(m+1)^2
- 2(n-m-1)(m+2)^2 > 0$ for the premerger
industry with $n$ firms and $m$ leaders,
a very unpleasant condition. However, a
grid search over all possible values of $n$ and
$m$ such that $3(m+1) < n$ yields a very nice
result: for $n < 28$, if a leader-generating
merger is welfare-enhancing, it is profit-enhancing!
This contrasts with the existing
literature on merger analyses performed
with static quantity models; Salant et al.
(1983) show conditions wherein using a
Cournot oligopoly model to examine mergers
can imply profit losses from mergers for
the participants, due to equilibrium output
adjustments. In general, the size of merger
that is considered here would show losses
under a Cournot model.

4Alternatively, when $3m - 1 < n$, then $X^T(m,n) >
X^T(m-1, n+1)$; that is, breaking up a leader firm into
two followers reduces welfare, even though the total
number of firms has increased.

5When $n > 28$, the quartic properties of $f(m,n)$
make characterization difficult.
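The grid search over f(m, n) is easy to reproduce. The sketch below is an illustration rather than the original procedure: the function names are hypothetical, the grid is assumed to cover 2 ≤ n ≤ 27, and the Cournot comparison uses the standard per-firm Cournot profit (a − c)²/[b(n + 1)²], which is assumed here and is not stated in the text.

```python
from fractions import Fraction

def f(m, n):
    """Profitability condition: the leader-generating merger pays iff f(m, n) > 0."""
    return (n - m + 1) ** 2 * (m + 1) ** 2 - 2 * (n - m - 1) * (m + 2) ** 2

def grid_search(n_max=27):
    """Pairs (m, n) with 3(m + 1) < n <= n_max at which the merger is NOT profitable."""
    failures = []
    for n in range(2, n_max + 1):
        for m in range(0, n - 1):      # at least two followers must exist
            if 3 * (m + 1) < n and f(m, n) <= 0:
                failures.append((m, n))
    return failures

def cournot_merger_gain(n):
    """Gain to two of n Cournot firms from merging, in units of (a - c)^2 / b.

    Assumes the merged firm remains an ordinary Cournot player among n - 1 firms.
    """
    return Fraction(1, n ** 2) - 2 * Fraction(1, (n + 1) ** 2)

if __name__ == "__main__":
    print(grid_search())
    print([n for n in range(2, 28) if cournot_merger_gain(n) > 0])
```

An empty list from `grid_search` means that every welfare-enhancing leader-generating merger on the grid is also profitable. Under the assumed Cournot benchmark, by contrast, a two-firm merger gains only when n = 2, which is the sense in which a merger of this size would show losses in a pure Cournot setting.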

Thus, when $m$ is small compared to $n$
(i.e., somewhat less than one-third of the
firms are leaders), mergers that are leader-generating
can be both privately and socially
advantageous. However, this is not to
say that all mergers that increase the number
of leaders are welfare-enhancing (or
that all divestments are welfare-impairing⁶).
As indicated earlier, too many chefs can
spoil the broth: $\partial X^T/\partial m$ is negative when
$m > n/2$. In this case (in equilibrium), dismemberment
of a firm which would decrease
the stock of leaders and increase the
stock of followers would be socially beneficial.
The main point is, however, that mergers
can be welfare-enhancing, even without
cost advantages or other traditional considerations.


II. Conclusions


The point of this paper is very simple: 
asymmetry, and thus concentration, can be 
beneficial from society’s viewpoint. This 
benefit does not depend upon scale econ- 
omies or marketing advantages or learning- 
by-doing. It derives entirely from the nonco- 
operative nature of firm interaction. Alter- 
natively put, asymmetric organization can 
be socially as well as individually optimal. 
Thus, focusing on concentration as deleteri- 
ous is misguided: actions that reduce the 
number of firms or increase concentration 
can be welfare-enhancing. 

The fundamental attribute that contributed
to this result was rational asymmetry
in firm strategic behavior. Firms adopt
roles in industries; as indicated earlier, not
all firms in an industry find it beneficial to
operate market-forecasting groups or to engage
in R&D, especially if other firms in the
industry do such things. The Nash equilibria
that support such outcomes can involve
greater aggregate output than those associated
with symmetric roles. That such individually
rational behavior can be socially
advantageous is simply one more reason for
avoiding single-minded adherence to the
siren song of symmetry, both in analysis and
(especially) in policy.

6Welfare-enhancing dismemberment of a leader that
produces two followers simply reverses the stated condition.
From the derivative condition, mergers of two
followers into a follower always lower welfare. Since
$X^T(m,n) > X^T(m-1, n-1)$, merger of two leaders
into a leader always lowers welfare.


Horizontal Mergers: The 50-Percent Benchmark


By Dan Levin* 


The paradigm often used in the study
of industrial organization relates market
structure to conduct and performance
(e.g., F. M. Scherer, 1980 pp. 4-5). Measures
such as concentration ratios and the
Herfindahl-Hirschman index provide partial
characterization of structure by capturing
the number and size distribution of firms.

Horizontal merger among a subset of 
firms in the same market may reduce com- 
petition by reducing the number of firms 
and increasing concentration. If the merged 
firm can lower costs by reallocating produc- 
tion, the incentive to merge is reinforced. 
The concern is that such a change in struc- 
ture increases market power and adversely 
affects market performance. Many econo- 
mists dismiss or are skeptical about favor- 
able effects of horizontal mergers. Scherer 
(1980 p. 546) concludes: “an impressive ac- 
cumulation of evidence points to the con- 
clusion that mergers seldom yield substan- 
tial cost savings, real or pecuniary.” 

Such views are the main rationale for 
antitrust policy in this area. Section 7 of the 
Clayton Act was designed to thwart 
monopoly power “in its incipiency”; it 
speaks in terms of a merger that “may be 
substantially to lessen competition or to tend 
to create a monopoly.” Yet, antitrust policy 
toward horizontal mergers is evolving. The 
Department of Justice adopted new guide- 
lines in 1982 with modest changes in 1984. 
Further proposals were made during the 
Reagan Administration. 

The four papers in the recent Symposium 
on “Horizontal Mergers and Antitrust” in 
The Journal of Economic Perspectives (Fall 
1987) clearly demonstrate the complexity of 
the issues involved in horizontal mergers: 
market definition, market concentration, 
ease of entry, and efficiencies. I abstract 
here from issues of market definition or 
entry and focus on the effect such mergers 
have on performance and welfare vis-a-vis 
their impact on concentration, market 
power, and conduct. In other words, this 
paper contributes to the debate described 
so nicely by Lawrence White (1987 p. 14), 
who poses it in terms of the following two 
hypotheses: is it the case that “the more 
easily a group of sellers (who collectively 
might have market power) can coordinate 
and police their mutual actions, the more 
likely are they to approximate a monopoly 
outcome”; or is it the case that “it only 
takes two to make a horse race”? 

I analyze the consequences of a horizontal merger by a subset of firms in an industry, assuming that firms that are not part of the merger behave à la Cournot. My analysis is related to that of Stephen Salant, Sheldon Switzer, and Robert Reynolds (1983) (SSR hereafter), who study the consequences of a horizontal merger by a subset of firms in a Cournot industry. Unlike SSR, I do not restrict the merged group to remain a Cournot player after the merger.¹ Thus, the merged group can become a Stackelberg leader, a conjectural variation player, or remain Cournot.² I impose only stability conditions and proceed to study the implications of such horizontal mergers on profits and welfare.³

I find that, if a group of firms with less than 50 percent of market output considers a horizontal merger, then a) any contraction of output by the merged group will cut profits below the level obtained by only reallocating their premerger output⁴ and b) any profitable merger will raise welfare. These results may have strong implications for public policy in this area. With market boundaries defined:


The [new (1982)] Guidelines use the 
Herfindahl-Hirschman Index (HHI) as 
their primary market concentration 
guide, with concentration levels of 
1000 and 1800 as their two key levels. 
Any merger in a market with a post- 
merger HHI below 1000 is unlikely to 
be challenged; a merger in a market 
with a postmerger HHI above 1800 is 
likely to be challenged (if the merger 
partners have market shares that cause 
the HHI to increase by more than 
100), unless other mitigating circum- 
stances exist, like easy entry. Mergers 
in markets with post-concentration 
HHI levels between 1000 and 1800 
require further analysis before a deci- 
sion is made whether to challenge 

[White, 1987 p. 16]. 


Taken at face value, my analysis suggests that a two-firm merger should not be challenged when their premerger market share is less than 50 percent, which is assured when the HHI is less than 1250. Some limitations of the analysis are discussed in the summary section.


¹SSR assume that the merged group behaves like a multiplant player who engages in a noncooperative game against other firms after the merger. In the case of symmetric, identical constant marginal cost, this assumption implies that, no matter how many firms merge, their total postmerger output will be the same as that of one single firm that stays out. I find this implication to be restrictive.

²The literature in industrial organization recognizes the possible need to assign different perceptions and modes of behavior to separate firms whose sizes or cost structures are asymmetrical. See Scherer (1980 pp. 176, 232-33) for a discussion of the dominant firm model and Hal Varian (1984 pp. 101-3) for a discussion of the Stackelberg leader and conjectural-variations models. An overview of recent developments in the theory of horizontal mergers is in Gerard Gaudet and Salant (1989).

³Welfare analysis is absent from SSR's work.

⁴SSR show that it is possible and even plausible that such a horizontal merger, within a Cournot industry, is not profitable. Their linear-demand example shows that, if the subset of firms that merged has less than 80 percent of premerger output, then their total profits will be lower. One may ask how sensitive this market-share mark is to the linear-demand and identical-marginal-cost assumptions. If by dropping linearity a small percentage of market output by the merged group may assure an increase in the group profits and reduction in welfare, then the SSR demonstration, though interesting, has little bearing on policy issues.
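
The 1250 figure rests on a short bound: two firms with a combined share of s percent contribute at least s²/2 to the HHI (the minimum occurs when they split s equally), so a combined share of 50 percent or more forces the HHI to at least 1250. A minimal Python sketch of that arithmetic follows. The function names and the sample market are illustrative assumptions, not taken from the Guidelines or from White (1987).

```python
def hhi(shares_pct):
    """Herfindahl-Hirschman index from market shares expressed in percent."""
    return sum(s * s for s in shares_pct)


def min_pair_contribution(combined_share_pct):
    """Smallest possible HHI contribution of two firms with a given combined share.

    s1**2 + s2**2 with s1 + s2 held fixed is minimized at an equal split.
    """
    half = combined_share_pct / 2.0
    return 2.0 * half * half


# Two firms that jointly hold 50 percent contribute at least 1250 points,
# so a market HHI below 1250 rules out any two-firm group that large.
threshold = min_pair_contribution(50)

# Hypothetical market: shares in percent, summing to 100.
example_shares = [15, 15, 10, 10, 10, 10, 10, 10, 10]
example_hhi = hhi(example_shares)
largest_pair_share = sum(sorted(example_shares, reverse=True)[:2])
```

In this hypothetical market the HHI is below 1250 and the two largest firms together hold well under half of output, which is the implication the text relies on.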


I. The Basic Model 


Let there be n > 1 Cournot firms that sell their homogeneous output at the market price, P. The market (inverse) demand function of total industry output, Q, is P(Q), with $P' < 0$ where $P > 0$, and is assumed to be twice continuously differentiable. I also assume the following.


ASSUMPTION 1 (A1): Firms have cost functions of the form $C_i(q_i) = c_i q_i$, where $q_i \ge 0$ is the output of the ith firm.


Let $\underline{c} = \min_i(c_i)$. Clearly with A1 we must have $P(Q) > \underline{c}$ in order for firms ever to produce positive outputs. Define output level $\bar{Q}$ by $P(\bar{Q}) = \underline{c}$.

Different marginal cost provides addi- 
tional incentive to merge. To focus on in- 
centives for horizontal merger that relate 
strictly to market power, concentration, and 
conduct, it will sometimes be useful to ab- 
stract from efficiency considerations and as- 
sume, as in SSR, the following. 


ASSUMPTION 1* (A1*): All firms have identical cost functions, $C(q_i) = c q_i$.


Next, I impose the following restrictions on the demand function.


ASSUMPTION 2 (A2): $P'(Q) + P''(Q) q_i < 0$ for all $0 \le q_i \le Q$, as long as $P > 0$.


Frank Hahn (1962) makes and defends 
A2. It requires that, at all possible outputs, 
the marginal revenue of any one producer 
with a given output is a diminishing func- 
tion of total output of his rivals. Roy Ruffin 
(1971) assumes A2, demonstrates its reason- 
ableness, and suggests an alternative inter- 
pretation, namely that A2 requires that, at 
all possible outputs, the marginal revenue 
function facing any firm is steeper than the 
demand function. A2, together with another 
assumption that is trivially satisfied by A1, 


1240 THE AMERICAN ECONOMIC REVIEW 


has been shown to assure the stability of 
Cournot oligopoly.⁵ At the very least, A2 is 
an important extension of the example in 
SSR, which assumes that $P''$ is identically 
zero. 

It is well known that A1 and A2 are sufficient for the existence of a unique Cournot-Nash Equilibrium (CNE).⁶ The unique CNE is completely characterized by the first-order conditions for profit maximization of n firms:


(1) $P(Q) + P'(Q) q_i - c_i = 0, \quad i = 1, \ldots, n.$

Summing over i and rearranging yields

(2) $P(Q) + P'(Q) Q / n = \sum_{i=1}^{n} c_i / n.$

Denote by $Q^*$ total industry output at the CNE with n firms, by $q_i^* > 0$ each firm's output, by $P^* = P(Q^*)$ the market price, and by $\pi_i^* = q_i^* (P^* - c_i)$ each firm's profits.
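
As an illustration of conditions (1) and (2), the sketch below solves the Cournot-Nash equilibrium in closed form for the special case of linear demand P(Q) = a - bQ. Linearity, the function name, and the numbers are assumptions made for the example only; the paper's results do not require linear demand.

```python
def cournot_linear(a, b, costs):
    """Cournot-Nash equilibrium for P(Q) = a - b*Q with constant marginal costs.

    Summing the first-order conditions P - b*q_i - c_i = 0 over the n firms
    gives P = (a + sum(costs)) / (n + 1). All firms are assumed to be active.
    """
    n = len(costs)
    price = (a + sum(costs)) / (n + 1)
    outputs = [(price - c) / b for c in costs]
    if min(outputs) <= 0:
        raise ValueError("some firm would be inactive; closed form does not apply")
    profits = [q * (price - c) for q, c in zip(outputs, costs)]
    return price, outputs, profits


# Hypothetical three-firm industry with asymmetric marginal costs.
price, outputs, profits = cournot_linear(a=10.0, b=1.0, costs=[1.0, 2.0, 3.0])
total_output = sum(outputs)
```

With these numbers each firm's first-order condition holds with equality, and price plus P'(Q)Q/n equals the average marginal cost, as (2) requires.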


II. Horizontal Mergers: Incentives and Welfare Implications


A subset of m firms, $2 \le m < n$, is considering a merger. Denote this set of firms by M, their total output by $Q_M$, and their premerger total output by $Q_M^* = \sum_{j \in M} q_j^*$. Denote by F the fringe of $k = n - m$ firms that stay out of the merger and by $Q_F$, F's total output.


ASSUMPTION 3 (A3): The k members of 
F remain Cournot firms after a merger. 


This assumption is made in SSR, and though 
it is potentially restrictive, I maintain it here 


⁵Hahn (1962) shows that A2 together with the assumption $P' - C_i'' < 0$ for all firms assures stability of Cournot oligopoly. A recent work by A. Al-Nowaihi and P. L. Levine (1985) shows that Hahn's conditions assure only local stability. It is global only if the number of firms is less than five.

⁶Levin (1982) provides a direct proof that allows any continuous and twice-differentiable $C_i(q_i)$, as long as $P' - C_i''(q_i) < 0$, which is trivially satisfied by A1. This result can be shown to be a special case of a much more general proof in B. Rosen (1965).


DECEMBER 1990 


to simplify the analysis. However, unlike 
SSR, which requires that M remain a 
Cournot firm as well, I impose no restric- 
tions except cost minimization on M's mode 
of behavior. In other words, though I do not 
model the mechanism that determines M's 
behavior after the merger, I do allow the 
possibility that M may become a Stackel- 
berg leader, a "conjectural variation" firm, 
stay Cournot as in SSR, or anything else. 

Clearly $Q_M \le \bar{Q}$. Under A1, A2, and A3, Levin (1982) shows that for any given $Q_M \le \bar{Q}$ there is a unique CNE for firms in F satisfying for all $i \in F$


(3) $P(Q_M + Q_F) + P'(Q_M + Q_F) q_i - c_i \le 0$


with strict inequality only if $q_i = 0$. Denote by $q_i(Q_M)$ and $Q_F(Q_M)$ the output of the ith firm in F and the total output of F when the output of M is $Q_M$.


LEMMA 1: Any $0 \le Q_M^0 < Q_M \le \bar{Q}$ implies $Q_M^0 + Q_F(Q_M^0) \le Q_M + Q_F(Q_M)$ and $q_i(Q_M^0) \ge q_i(Q_M)$ for all $i \in F$.


PROOF:

Suppose $Q_M^0 < Q_M$ but that $Q_M^0 + Q_F(Q_M^0) > Q_M + Q_F(Q_M)$. This implies, due to A2 and (3), that $P(Q_M^0 + Q_F(Q_M^0)) + P'(Q_M^0 + Q_F(Q_M^0)) q_i(Q_M) - c_i < 0$ for all $i \in F$. Since $P' < 0$, it implies that $q_i(Q_M^0) < q_i(Q_M)$ for all $i \in F$, so that $Q_F(Q_M^0) < Q_F(Q_M)$. Thus, $Q_M^0 + Q_F(Q_M^0) < Q_M + Q_F(Q_M)$, which leads to a contradiction. Hence $Q_M^0 < Q_M$ implies $Q_M^0 + Q_F(Q_M^0) \le Q_M + Q_F(Q_M)$. The last inequality implies, due to A2 and (3), that for all $i \in F$, $P(Q_M + Q_F(Q_M)) + P'(Q_M + Q_F(Q_M)) q_i(Q_M^0) - c_i \le 0$. Thus, $q_i(Q_M^0) \ge q_i(Q_M)$ for all $i \in F$, establishing the proof.


The uniqueness of the CNE for F implies that, if $Q_M = Q_M^*$, $q_i(Q_M^*) = q_i^*$ for all $i \in F$. Thus, in such a case, total industry output and price remain as in the premerger equilibrium.

Distinguish between $Q_M < Q_M^*$ and $Q_M > Q_M^*$, namely, M contracts or expands output relative to its premerger level.

Case 1: $Q_M < Q_M^*$ (contraction). From Lemma 1, for all $i \in F$, $q_i(Q_M) \ge q_i^* > 0$. Thus, as long as $Q_M \le Q_M^*$, k firms with strictly positive outputs remain in F, so that (3) holds with equality. Summing over all $i \in F$ in (3) and rearranging yields


(4) $P(Q_M + Q_F) + P'(Q_M + Q_F) Q_F / k = \bar{c}_F$


where $\bar{c}_F = \sum_{i \in F} c_i / k$, the average marginal cost of firms in F.

In the absence of fixed costs, (4) defines $Q_F$, and thus each $q_i$, $i \in F$, as a continuous function depending only on $Q_M$ and $\bar{c}_F$ but not on the distribution of the $c_i$'s. This property, discussed in Theodore Bergstrom and Hal Varian (1985a,b), is very useful. It implies (since $\bar{c}_F$ remains constant) that all variables of the model can be expressed in terms of $Q_M$. Differentiating (4) with respect to $Q_M$ yields


(5) $\beta \equiv \dfrac{d(Q_M + Q_F)}{dQ_M} = \dfrac{P'}{kP' + P' + P''Q_F}.$


$\beta$ is the response of total industry output to an increase in the output of M. Since $Q_F \le Q$, $P' + P''Q_F \le \max[P', P' + P''Q] < 0$, where the last inequality is due to A2. Thus, using (5), I conclude that


(6) $0 < k\beta < 1.$


Case 2: $Q_M > Q_M^*$ (expansion). From Lemma 1, for all $i \in F$, $q_i(Q_M) \le q_i^*$, yet $Q_M + Q_F(Q_M) \ge Q_M^* + Q_F(Q_M^*)$; that is, total industry output increases when M expands its output.
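
To make (4) through (6) concrete, the following sketch assumes linear demand P(Q) = a - bQ, for which (4) can be solved for the fringe's total output directly and β reduces to 1/(k + 1). The linear form, the function names, and the parameter values are assumptions for illustration only.

```python
def fringe_output(a, b, k, c_bar_f, q_m):
    """Total fringe output Q_F solving (4) when P(Q) = a - b*Q.

    (4) reads a - b*(q_m + Q_F) - b*Q_F/k = c_bar_f, which is linear in Q_F.
    All k fringe firms are assumed to stay active.
    """
    return k * (a - c_bar_f - b * q_m) / (b * (k + 1))


def total_output_response(a, b, k, c_bar_f, q_m, step=1e-3):
    """Finite-difference estimate of beta = d(Q_M + Q_F)/dQ_M."""
    q0 = q_m + fringe_output(a, b, k, c_bar_f, q_m)
    q1 = (q_m + step) + fringe_output(a, b, k, c_bar_f, q_m + step)
    return (q1 - q0) / step


k = 2
beta = total_output_response(a=10.0, b=1.0, k=k, c_bar_f=1.0, q_m=3.0)
```

Here the fringe offsets only part of any change in M's output, so total output moves in the same direction as $Q_M$ but by less, which is what (6) states.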


A. Profitability of Horizontal Mergers 


Define a merger as profitable for a subset m of firms if the postmerger profits of M exceed the sum of profits the m firms in M have without a merger. I consider first a contraction by the merged firms; that is, $Q_M < Q_M^*$ (Case 1). Let $c_M$ be the lowest marginal cost of any firm in M, that is, $c_M = \min(c_j)$, $j \in M$, and let R be the cost saving to M as a result of producing the premerger output, $Q_M^*$, but in the lowest-marginal-cost plant(s). This is $R = \sum_{j \in M} (c_j - c_M) q_j^* \ge 0$.


THEOREM 1: Assume A1, A2, A3, and $c_M \le \bar{c}_F$, and consider a merger by any subset m of firms that have no more than 50 percent of the premerger market output. Any contraction of output by the merged firm will reduce profit below the level obtained by simply reallocating their premerger output.


PROOF:

To minimize the cost of M, production must take place only at plant(s) with marginal cost $c_M$. Thus, actual profits for M are $\pi_M = Q_M [P(Q_M + Q_F(Q_M)) - c_M]$. Write $Q_F^* = Q_F(Q_M^*)$. Then $Q_M^* \le Q_F^*$ (the 50-percent assumption) and $dQ_F/dQ_M = \beta - 1 < 0$ from (6). Hence, any $Q_M < Q_M^*$ implies $Q_F(Q_M) > Q_F^* \ge Q_M^* > Q_M$. Then,


$d\pi_M / dQ_M$
$= P(Q_M + Q_F(Q_M)) - \bar{c}_F + \bar{c}_F - c_M + P'(Q_M + Q_F(Q_M)) Q_M \beta$
$= -P'(Q_M + Q_F(Q_M)) Q_F(Q_M)/k + P'(Q_M + Q_F(Q_M)) Q_M \beta + \bar{c}_F - c_M$
$= (-P'/k)[Q_F(Q_M) - Q_M k\beta] + \bar{c}_F - c_M$
$> (-P'/k)[Q_F(Q_M) - Q_M] + \bar{c}_F - c_M$
$\ge \bar{c}_F - c_M \ge 0.$


The first equality is due to (5), the second equality is due to (4), the strict inequality is due to $k\beta < 1$ from (6) and the fact that $(-P'/k) > 0$, the first inequality is due to $Q_F(Q_M) - Q_M \ge 0$, and the last inequality is by assumption. Since $d\pi_M/dQ_M > 0$ at any $Q_M \le Q_M^*$, $Q_M < Q_M^*$ implies that $\pi_M < Q_M^*(P^* - c_M)$, the profit from producing the premerger output at marginal cost $c_M$.⁷


⁷The result of Theorem 1 can be extended to cost functions of the form $C_i(q_i) = c_i q_i + (1/2) d q_i^2$ as long as $d \le 0$ and the second stability assumption discussed in footnote 5 is satisfied ($P' - d < 0$).

It is possible that a merger such as in Theorem 1 is profitable with contraction when R > 0. However, contraction cannot be profitable if one eliminates reallocation of output as an incentive for such merger, as in the following corollary.


COROLLARY: Assume A1*, A2, and A3, and consider a merger as in Theorem 1. Any contraction of output by the merged firm will cut profit (i.e., $\pi_M < \pi_M^*$).


PROOF:

Here, $c_M = c = \bar{c}_F$, R = 0, and the rest of the proof mimics the proof to Theorem 1. One concludes that $Q_M < Q_M^*$ implies $\pi_M < \pi_M^*$.
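
The corollary can be illustrated numerically in the linear, identical-cost case. The sketch below computes M's actual profit when the fringe best-responds à la Cournot and compares a contraction and a small expansion with the premerger level. Linear demand, the function names, and all parameter values are assumptions chosen for the example.

```python
def merged_profit(a, b, c, k, q_m):
    """Actual profit of M when k Cournot fringe firms best-respond to q_m.

    Assumes P(Q) = a - b*Q and identical marginal cost c (A1*), with all
    fringe firms active.
    """
    q_f = k * (a - c - b * q_m) / (b * (k + 1))
    price = a - b * (q_m + q_f)
    return q_m * (price - c)


def premerger_group_output(a, b, c, n, m):
    """Combined Cournot output of m of the n symmetric firms before merging."""
    return m * (a - c) / (b * (n + 1))


a, b, c, n, m = 10.0, 1.0, 1.0, 5, 2   # M holds 40 percent of premerger output
k = n - m
q_star = premerger_group_output(a, b, c, n, m)
profit_star = merged_profit(a, b, c, k, q_star)          # equals premerger profit of the pair
profit_contract = merged_profit(a, b, c, k, 0.8 * q_star)
profit_expand = merged_profit(a, b, c, k, 1.1 * q_star)
```

Under these assumptions the contraction lowers M's profit while the small expansion raises it, in line with the sign of $d\pi_M/dQ_M$ derived in the proof of Theorem 1.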


B. Horizontal Mergers and Welfare 


I use the sum of consumer and producer surplus, ignoring income effects, as a welfare measure denoted by W. Let $W(Q_M)$ be the postmerger level of welfare given that M produces $Q_M$, and let $\Delta W(Q_M \ge Q_M^0)$ measure the change in welfare as a result of M increasing its output from $Q_M^0$ to $Q_M$. This is


(7) $\Delta W(Q_M \ge Q_M^0) = W(Q_M) - W(Q_M^0)$
$= \int_{Q_M^0 + Q_F(Q_M^0)}^{Q_M + Q_F(Q_M)} P(t)\,dt - \sum_{i \in F} c_i [q_i(Q_M) - q_i(Q_M^0)] - c_M (Q_M - Q_M^0).$


Let $\underline{c}_F$ be the lowest marginal cost of any firm in F, that is, $\underline{c}_F = \min(c_i)$, $i \in F$.


THEOREM 2: Under A1, A2, A3, and $\underline{c}_F \ge c_M$, as long as the profit of the merged firm remains positive, any increase in its output raises welfare.

PROOF:

By Lemma 1, $Q_M \ge Q_M^0$ implies $Q = Q_M + Q_F(Q_M) \ge Q_M^0 + Q_F(Q_M^0) = Q^0$. Therefore $\int_{Q^0}^{Q} P(t)\,dt \ge P(Q)(Q - Q^0)$. Since in this case $q_i(Q_M) \le q_i(Q_M^0)$ for all $i \in F$, then $\sum_{i \in F} c_i [q_i(Q_M) - q_i(Q_M^0)] \le \underline{c}_F [Q_F(Q_M) - Q_F(Q_M^0)]$. Thus,


$\Delta W(Q_M \ge Q_M^0)$
$\ge P(Q)(Q - Q^0) - \underline{c}_F [Q_F(Q_M) - Q_F(Q_M^0)] - c_M (Q_M - Q_M^0)$
$\ge P(Q)(Q - Q^0) - c_M (Q - Q^0)$
$= (P(Q) - c_M)(Q - Q^0) \ge 0.$


The second inequality is due to $\underline{c}_F \ge c_M$ and the fact that $Q_F(Q_M) - Q_F(Q_M^0) \le 0$, and the last inequality is due to the fact that both terms in the product are nonnegative.


The intuition of this result is simple. When M increases its output, F contracts its output, but by a smaller amount. Hence, one can decompose such a change into two distinct parts: a pure reallocation of production from F to M and a pure increase in the total output of this industry. The net increase in output is welfare-enhancing, since price is above marginal cost. The reallocation of output can adversely affect welfare. The assumption, $\underline{c}_F \ge c_M$, assures that the reallocation part is also welfare-enhancing.⁸

Often welfare increases with the output of M under weaker conditions. Note that this result is independent of the market share of M. Also note that under A1*, $\underline{c}_F = c_M$, so welfare increases with the output of M.

Let $\Delta W(Q_M)$ measure the change in welfare as a result of a merger M that produces $Q_M$. Clearly $\Delta W(Q_M) = W(Q_M) - W(Q_M^*) + R$. If the reallocation of output (efficiency) incentives for merger are removed (i.e., R = 0), one obtains the following.


⁸With linear demand, the weaker $\bar{c}_F \ge c_M$ assures a favorable reallocation effect as a result of an increase in $Q_M$. However, when $Q_M$ increases beyond $Q_M^*$, the number of firms in F with positive output may fall, and $\bar{c}_F$ will get smaller as a result of high-marginal-cost firms dropping out.


THEOREM 3: Under A1*, A2, and A3, any profitable merger by any subset m of firms that have no more than 50 percent of the premerger market output will raise welfare.


PROOF:

Under these conditions, the corollary to Theorem 1 shows that a profitable merger implies $Q_M > Q_M^*$. This implies, by Theorem 2, that welfare will increase as long as $\underline{c}_F - c_M \ge 0$; but under A1*, $\underline{c}_F = c = c_M$.


Could Theorem 3 be extended to the weaker A1 rather than A1*? The point is that under A1 there are possible reallocation gains to M (R > 0), so it is possible to end up with profitable mergers where $Q_M < Q_M^*$. Are the efficiency gains, R, sufficient to compensate for such possible contraction in $Q_M$ and in total output Q? The answer is yes, if one assumes $P'' \le 0$ rather than A2.


THEOREM 4: Under A1, $P'' \le 0$, A3, and $\underline{c}_F - c_M \ge 0$, a profitable merger by any subset m of firms that have no more than 50 percent of the premerger market output will raise welfare.


PROOF:

By Theorem 2, I need to show the above only for $Q_M^0 < Q_M^*$ (contraction) and, in light of Theorem 1, only for R > 0. Since $Q_M^0 < Q_M^*$, it is known that (4), (5), and (6) hold and that (3) holds with equality.

Let $\Delta f(Q_M^0) \equiv f(Q_M^0) - f(Q_M^*)$ and denote consumer surplus by CS. $\Delta W(Q_M^0) = \Delta CS(Q_M^0) + \Delta\pi_F(Q_M^0) + \Delta\pi_M(Q_M^0) \ge \Delta CS(Q_M^0) + \Delta\pi_F(Q_M^0)$, since $\Delta\pi_M(Q_M^0) \ge 0$ by presumption. $\Delta CS(Q_M^0) \ge -(P^0 - P^*) Q^*$ (i.e., the negative of the increase in market price times the larger quantity $Q^* > Q^0$). Then,


$\Delta\pi_F(Q_M^0) = \sum_{i \in F} \int_{Q_M^*}^{Q_M^0} \pi_i'(t)\,dt$

$= \sum_{i \in F} \int_{Q_M^*}^{Q_M^0} \{ q_i'(t) [P(t + Q_F(t)) - c_i] + q_i(t) P'(t + Q_F(t)) \beta \}\,dt$

$= \int_{Q_M^0}^{Q_M^*} -P'(t + Q_F(t)) \beta \left[ Q_F(t) + \sum_{i \in F} (-q_i') q_i / \beta \right] dt$


by using (3), where $\pi_i' = d[q_i(Q_M)(P(Q_M + Q_F(Q_M)) - c_i)]/dQ_M$ and where $q_i' = dq_i(Q_M)/dQ_M = -(P' + P''q_i)/(kP' + P' + P''Q_F) < 0$, for all $i \in F$, which is obtained by differentiating (3). However, under $P'' \le 0$, $-q_i'$ and $q_i$ are positively correlated; thus, $\sum_{i \in F} (-q_i') q_i / \beta \ge (Q_F/k)(1 - \beta)/\beta$ and also $(1 - \beta)/\beta \ge k$ [under $P'' \le 0$, (5) implies that $0 < (k + 1)\beta \le 1$], implying that $\sum_{i \in F} (-q_i') q_i / \beta \ge Q_F$. Thus,


$\Delta\pi_F(Q_M^0) \ge \int_{Q_M^0}^{Q_M^*} -P'(t + Q_F(t)) \beta \, 2Q_F(t)\,dt$

$\ge Q^* \int_{Q_M^0}^{Q_M^*} -P'(t + Q_F(t)) \beta\,dt$

$= (P^0 - P^*) Q^*.$


The first inequality is due to $Q_M^* > Q_M^0$ and $-P' > 0$. The second inequality is due to $2Q_F(t) \ge 2Q_F^* \ge Q^*$ because of both the fact that $Q_F$ expands as $Q_M$ contracts and the 50-percent assumption. The conclusion is that $\Delta W(Q_M^0) \ge 0$. This establishes the proof.


Profitability of the merger is measured in terms of actual profits of M rather than perceived profits. If M becomes a Stackelberg leader, the difference disappears since such M correctly anticipates the response of F.⁹ In other cases, unprofitable $Q_M$ are not stable in the sense that M can dissolve itself or imitate what it did before the merger.


C. A Few Additional Observations 


1. The 50-percent mark in the theorems is 
not necessary. Simple computations show 
that, if demand is linear and if marginal 
costs are the same, a merger of two firms in 
a three-firm market is either unprofitable or 
welfare-enhancing. 
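
The "simple computations" behind observation 1 can be sketched as follows. The code assumes linear demand P(Q) = a - bQ and a common marginal cost c, lets the single remaining fringe firm best-respond à la Cournot, and scans the merged pair's output for any level that is strictly profitable yet lowers welfare. The parameter values, the grid, and the function name are illustrative assumptions.

```python
def outcome(a, b, c, k, q_m):
    """Return (profit of M, welfare) when k Cournot fringe firms respond to q_m.

    Linear demand P(Q) = a - b*Q, common marginal cost c; welfare is the
    area under demand minus total cost.
    """
    q_f = k * (a - c - b * q_m) / (b * (k + 1))
    q = q_m + q_f
    price = a - b * q
    profit_m = q_m * (price - c)
    welfare = (a - c) * q - 0.5 * b * q * q
    return profit_m, welfare


a, b, c = 10.0, 1.0, 1.0
n, m = 3, 2
k = n - m
q_star = m * (a - c) / (b * (n + 1))          # premerger output of the pair
profit_star, welfare_star = outcome(a, b, c, k, q_star)

# Scan postmerger outputs of M from 0 up to the level where price hits cost.
grid = [i * (a - c) / b / 200 for i in range(201)]
tol = 1e-9
violations = [
    q for q in grid
    if outcome(a, b, c, k, q)[0] > profit_star + tol      # strictly profitable
    and outcome(a, b, c, k, q)[1] < welfare_star - tol    # yet welfare falls
]
```

For this parameterization the list of violations is empty even though the pair starts with two-thirds of premerger output, consistent with the claim that the 50-percent mark is sufficient rather than necessary.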

2. When M remains Cournot after the merger, it can be shown (using M's first-order condition for maximization and A2) that output will not increase (see Joseph Farrell and Carl Shapiro [1990] for this result with a more general cost structure). Thus, Theorem 2 will not apply, but Theorem 4 will: namely, if conditions of Theorem 4 are satisfied, profitable mergers are welfare-enhancing even if output falls.

3. Though $Q_M > Q_M^*$ implies $W > W^*$, not every expansion in $Q_M$ is profitable. A very large $Q_M$, so that P is close enough to $c_M$, is clearly not profitable. However, a small expansion in $Q_M$ will raise W and is profitable, as the derivative of the profit function in the proof to Theorem 1 suggests. Consider a case in which M becomes a Stackelberg leader that maximizes its profits while fully accounting for F's reaction. Since M may choose $Q_M = Q_M^*$, it must be able to increase its profit. The proof indicates that it must be with $Q_M > Q_M^*$.¹⁰

4. The proof in Theorem 1 relies on the condition that $k[1 + dQ_F/dQ_M] < 1$ but not directly on firms in F being Cournot firms. Assume A1* and consider, for example, firms in F behaving in a symmetric conjectural variation mode. Equilibrium for F is characterized now by the symmetric first-order condition


(10) $P(Q_M + Q_F) + P'(Q_M + Q_F) Q_F \gamma / k - c = 0$


where $(\gamma - 1)$ is the conjecture by each firm in F on other firms' responses to its change in output. Thus, $\gamma = 0$ is the competitive case, $\gamma = 1$ is the Cournot case, and $\gamma > 1$ is the case when firms in F expect other firms to expand output in response to an increase in output by one of the firms. From (10), $1 + dQ_F/dQ_M = P'/[P'k/\gamma + (P' + P''Q_F)]$. Further analysis shows that if $(k - 1)P' - P''Q_F \ge 0$, any $\gamma > 0$ satisfies this condition. However, if $(k - 1)P' - P''Q_F < 0$, which is more plausible, any $0 < \gamma < kP'/[(k - 1)P' - P''Q_F] \equiv z$ satisfies this condition. Note that, by A2, z > 1 in the last case. Thus, there is a whole range of conjectures for F, including some range where $\gamma > 1$, where contraction of output by M is not desirable while profitable expansions will raise welfare.


⁹If after the merger there is a sequential game in which M (who deviates from the original equilibrium) moves first, then in a perfect equilibrium of this game M behaves à la Stackelberg.

¹⁰Levin (1988) provides a direct proof that a Cournot firm that becomes a Stackelberg leader will raise its output. However, the analysis there assumes no change in the total number of firms.


III. Summary and Conclusions


The analysis here seems to suggest, under quite general conditions, that profitable horizontal mergers that start with less than 50 percent of premerger market share are welfare-enhancing. However, a few important caveats are in order.

1) The analysis here ignores income distribution. Expansionary profitable mergers reduce market price and benefit consumers but reduce the profits of the fringe. Contractionary profitable mergers, which are possible when cost-reduction incentives exist, hurt consumers and benefit the fringe. Theorem 1 suggests that expansionary mergers are more likely.

2) To simplify, I did not model the merger process; instead, I implicitly assumed that firms being acquired react passively. Morton Kamien and Israel Zang (1988, 1990) model such processes where firms behave strategically with respect to such activity. They find only limited scope for such mergers and show that, with identical constant marginal cost, such mergers will not take place if mergers of firms with more than 50 percent of market share are prohibited.¹¹

3) Fixed costs that can be shared provide important incentive for mergers. Such costs introduce discontinuity in the cost function and in general may "destroy" existence and uniqueness of the CNE. More important to the analysis is that such fixed costs introduce discontinuity in the function $Q_F(Q_M)$. When $Q_M$ increases beyond $Q_M^*$, firms in F may reach zero profits with strictly positive levels of output. It is possible then that expansion in $Q_M$ may cause reduction in total output and an increase in market price. Thus, while Theorem 1 is still valid (if one ignores the existence issue of the original CNE), Theorems 2, 3, and 4 are not valid in general.

4) My analysis assumed that F remains Cournot. If the fringe changes its behavior after the merger and becomes (say) more cooperative with M, conclusions may change. This situation will be the subject of a future study.


¹¹Kamien and Zang (1988, 1990) assume that the merged entity remains Cournot. Their results, especially the limited scope for such mergers, depend heavily on this assumption. Also, welfare analysis is absent in their article.


Input Market Price Discrimination and the Choice 
of Technology 


By Patrick DEGRABA* 


Recent concerns over the effects of the 
Robinson-Patman Act¹ and so-called price- 
protection policies such as most-favored- 
customer clauses (MFC's)² on market 
performance have given economists new 
reasons to examine the welfare effects of 
third-degree price discrimination. In order 
to assess these effects correctly, it is impera- 
tive that one understand how price discrimi- 
nation influences market behavior. 

Joan Robinson’s (1933) work launched 
the formal inquiry into the welfare effects of 
third-degree price discrimination. Building 
on the intuition presented by Arthur Pigou 
(1932), she showed that, if a monopolist 
faces two independent linear demand 
curves, the use of price discrimination will 
not affect industry output but will reduce 
welfare. Richard Schmalensee (1981) ex- 
tends these results to nonlinear demand 
curves and shows that an increase in total 
industry output is a necessary condition for 
price discrimination to be welfare improv- 
ing. Hal Varian (1985) broadens these re- 
sults by deriving upper and lower bounds on 
the welfare change due to the use of price 
discrimination. He shows that these results 
can be applied to markets in which there 
are nonzero cross price effects. 

All of this work examines how the ability of a monopolist to price-discriminate will affect the market outcome when all other characteristics of the market are treated as exogenous. Recently, two lines of research have extended this inquiry beyond the case of a monopolist in a market with exogenously fixed parameters.


*Johnson Graduate School of Management, Cornell University, Ithaca, NY 14853. I thank Robert Frank, Robert Smiley, Richard Thaler, and the participants of the JGSM applied microeconomics workshop for their helpful comments.

¹See William Baldwin (1987 pp. 438-40) for a good summary of the debate.

²See John Kwoka and Lawrence White (1989 pp. 196-7).

The first line considers the case of 
oligopoly. The work of Charles Holt and 
David Scheffman (1985) and Thomas 
Cooper (1986) has shown that restrictions 
on price discrimination imposed by the use 
of MFC’s can facilitate collusion between 
oligopolists attempting to restrict output. 
This implies that third-degree price discrim- 
ination can be welfare-improving. 

The second line of research shows that 
price discrimination by a firm can affect 
nonprice decisions made by other market 
participants, thus affecting the market out- 
come. Michael Katz (1987) presents a model 
in which a large firm’s ability to vertically 
integrate backward into the production of 
an input allows it to obtain a lower per-unit 
price from the supplier of the input than 
can be obtained by smaller firms without 
this ability. He shows that third-degree price 
discrimination reduces welfare unless it pre- 
vents inefficient backward integration. De- 
Graba (1987) shows that the use or nonuse 
of price discrimination by a national firm 
can affect nonprice decisions made by local 
firms that compete with the national firm. 
In this situation, third-degree price discrim- 
ination is welfare-reducing, because it in- 
duces local firms to produce a product that 
is “overly differentiated” from the product 
of the national firm. 

In all of the work cited above, price dis- 
crimination is important when sellers set 
prices in separate markets or charge differ- 
ent prices to different customers in the same 
market. The following analysis (which can 
be considered a contribution to the second 
line of research) suggests that price discrim- 
ination can be important even when a seller 
faces a single market in which all customers 
are identical. The intuition behind this re- 
sult is that nonprice decisions made by 
downstream producers (such as the choice 
of technology) can be affected by the use or 
nonuse of price discrimination by the sup- 
plier of an input. Even though the down- 
stream producers will reach a symmetric 
equilibrium, the equilibrium that they reach 
under price discrimination will be different 
from the one they reach under uniform 
pricing. This means that the pricing strategy 
of the supplier can determine which sym- 
metric equilibrium producers choose. 

I present a simple model to examine how 
price discrimination in a market for a vari- 
able input affects downstream producers’ 
long-run choice of a production technology. 
In the model, a monopoly supplier of a 
variable input sells to two downstream pro- 
ducers, who use the input in the production 
of a final good. These producers must first 
choose a level of marginal cost (a lower 
marginal cost is obtained by incurring a 
higher fixed cost) and then compete as 
Cournot duopolists who face a linear de- 
mand curve in the final goods market. I 
compare the market equilibrium in which 
the supplier is allowed to price-discriminate 
to the equilibrium in which the supplier 
must charge a uniform price to both down- 
stream firms. This comparison yields two 
main results. 


1) If the downstream producers have dif- 
ferent constant marginal costs of produc- 
tion, the price-discriminating input sup- 
plier will charge the low-cost producer a 
higher price than he charges the high-cost 
producer, partially offsetting the cost ad- 
vantage. 

2) The downstream producers will choose a 
technology with a higher marginal cost 
when the supplier price-discriminates 
than they will if the supplier charges a 
uniform price. Further, if the trade-off 
between marginal cost and fixed costs is 
a quadratic relationship, welfare in the 
long run is lower under price discrimina- 
tion. 


I. The Short Run 


In the formal model, I present a two-stage 
three-player game. The players include a 
monopoly supplier of a variable input and 
two downstream producers of a final good. 
The supplier sells the input to the produc- 
ers, who use this input along with others to 
produce a homogeneous output. Units are 
normalized so that one unit of the input is 
required to produce one unit of the output. 

In the first stage of the game, the 
monopoly supplier quotes a per-unit price, 
k,, for the input to each downstream firm, i. 
In stage 2, the downstream firms observe 
these prices and then compete as Cournot 
duopolists who face a linear market demand 
curve. Firm 7’s per-unit cost of output is 
k;+ c; where c; is an additional marginal 
cost of production. It may be helpful to 
think of c; as the cost of using other vari- 
able inputs which may differ across firms 
due to such reasons as different geographic 
location or the use of different technologies. 
I assume that each c, is set in a competitive 
market, so it represents the true cost of the 
resources used. 

The strategy for the supplier is an ordered
pair $(k_1, k_2) \in R^{2+}$, where $k_1$ is the
price quoted to firm 1 and $k_2$ is the price
quoted to firm 2. The strategy for each firm,
$i$, is a function, $Q_i: R^{2+} \to R^{1+}$. This function
maps every possible observed combination
of $k_1$ and $k_2$ into a quantity choice.

The payoff to the supplier is $\pi_S = \sum_i k_i q_i$,
which is the revenue generated from the
sales of the input. (The supplier produces
the input at zero marginal cost.) Let $p$ be
the price of the final good. Producer $i$’s
payoff is $\pi_i = (p - k_i - c_i) q_i$, which is the
net revenue from sales in the final goods
market.

The equilibrium concept employed is
that of subgame perfect Nash equilibrium
(Reinhard Selten, 1975). An equilibrium
strategy choice is an ordered quadruple
$(k_1^*, k_2^*, Q_1^*, Q_2^*)$ such that: 1) no player
could improve his payoff by unilaterally deviating
and 2) $(Q_1^*, Q_2^*)$ constitute Nash
equilibrium choices of $q_1$ and $q_2$ for every
possible choice of $(k_1, k_2)$.

The calculation of the perfect Nash equilibrium
proceeds in two steps. The first step
is to note that, in stage 2, the producers
simply choose the (unique) Cournot output
levels, $(q_1^*, q_2^*)$, for each subgame defined
by treating $k_1 + c_1$ and $k_2 + c_2$ as the
marginal costs of production.

Once this has been done, the first stage of
the game can be viewed as a profit-maximization
problem for the supplier in which he
simply chooses $(k_1, k_2)$ to maximize $\pi_S =
k_1 Q_1^* + k_2 Q_2^*$. The payoff to each of the
producers then is the profit of the subgame
defined by the supplier’s choice of prices.
When the market demand curve is of the
form


(1)   $p = a - b(q_1 + q_2)$


where $a, b > 0$, the Nash equilibrium strategies
for the producers are given by


(2a)  $Q_1^* = (a - 2c_1 - 2k_1 + c_2 + k_2)/3b$
(2b)  $Q_2^* = (a - 2c_2 - 2k_2 + c_1 + k_1)/3b$.


The structure outlined thus far can be
used to define two games, $\Gamma^d$ and $\Gamma^u$. In $\Gamma^d$
the supplier is able to price-discriminate,
and in $\Gamma^u$ the supplier is constrained to
charge a uniform price. When the supplier
is allowed to price-discriminate, a simple
maximization calculation reveals that the
equilibrium values of $k_1$ and $k_2$ are


(3a)  $k_1^{d*} = (a - c_1)/2$
(3b)  $k_2^{d*} = (a - c_2)/2$.


Likewise, when the supplier is constrained
to charge the same price to both
firms, a constrained maximization calculation
yields


(4)   $k^{u*} = (2a - c_1 - c_2)/4$.
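
As a rough numerical check on the short-run solution, the sketch below evaluates the Cournot quantities and the supplier's prices for one set of parameter values and confirms that small deviations from the candidate prices do not raise the supplier's revenue. The parameter values ($a = 10$, $b = 1$, $c_1 = 1$, $c_2 = 2$) and the function names are assumptions made only for illustration; they do not come from the paper.

```python
from fractions import Fraction as Fr

def cournot_quantities(a, b, c1, c2, k1, k2):
    """Stage-2 Cournot outputs when firm i's marginal cost is k_i + c_i."""
    q1 = (a - 2 * c1 - 2 * k1 + c2 + k2) / (3 * b)
    q2 = (a - 2 * c2 - 2 * k2 + c1 + k1) / (3 * b)
    return q1, q2

def supplier_revenue(a, b, c1, c2, k1, k2):
    """Supplier payoff k1*q1 + k2*q2 (input produced at zero marginal cost)."""
    q1, q2 = cournot_quantities(a, b, c1, c2, k1, k2)
    return k1 * q1 + k2 * q2

# Hypothetical parameter values, chosen only for illustration.
a, b, c1, c2 = Fr(10), Fr(1), Fr(1), Fr(2)

k1_d, k2_d = (a - c1) / 2, (a - c2) / 2      # equations (3a) and (3b)
k_u = (2 * a - c1 - c2) / 4                  # equation (4)

rev_d = supplier_revenue(a, b, c1, c2, k1_d, k2_d)
rev_u = supplier_revenue(a, b, c1, c2, k_u, k_u)

# Revenue is a concave quadratic in the input prices, so no small
# deviation from the candidate optimum should increase it.
eps = Fr(1, 100)
no_gain_d = all(
    supplier_revenue(a, b, c1, c2, k1_d + d1, k2_d + d2) <= rev_d
    for d1 in (-eps, 0, eps) for d2 in (-eps, 0, eps)
)
no_gain_u = all(
    supplier_revenue(a, b, c1, c2, k_u + d, k_u + d) <= rev_u
    for d in (-eps, 0, eps)
)
print(k1_d, k_u, k2_d, no_gain_d, no_gain_u)
```

Exact rational arithmetic is used so that the comparisons are not affected by floating-point rounding.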


OBSERVATION 1: When the supplier is al- 
lowed to price-discriminate, he charges the 
firm with the lower marginal cost a higher 
price than he charges the firm with the higher 
marginal cost. This price differential is less 
than (the absolute value of ) the marginal cost 
differential. 


This can be seen by setting $c_1 < c_2$ (without
loss of generality) in (3a) and (3b). Katz
(1987) presents this result for Cournot players
facing any demand curve that has a
downward-sloping marginal-revenue curve.²
The reason for this result is clear. The firm
with a lower marginal cost has the more
inelastic demand for the input, which causes
the supplier to charge him a higher price.
The result is easily seen in this example,
because the demand for the input is a linear
function of the prices of the inputs. The
additional marginal cost of production, $c_i$,
affects this demand function only through
the constant term, and it does so with a
negative sign. Thus, a lower $c_i$ causes the
demand for the input to be more inelastic,
which leads to a higher price.

The general restrictions on downstream
competition under which $k_1^{d*} > k^{u*} > k_2^{d*}$ if
and only if $c_1 < c_2$ are not known.³ It is
easy to show that this relationship holds if
the demand curves of downstream firms are
linear and symmetric. Note that this includes
the Cournot example above as well
as price-setting games in which the demand
in the final goods market is linear (the latter
set of games includes the case in which
firms are symmetrically placed along a
Hotelling line, where all consumers have
the same reservation price for the good, as
a special case). Note that since $k_1^{d*}$, $k_2^{d*}$,
and $k^{u*}$ are derived from interior equilibrium
points, the relationship $k_1^{d*} > k^{u*} >
k_2^{d*}$ if and only if $c_1 < c_2$ must hold for (at
least) “small” uniformly continuous perturbations
of both the symmetry of the demand
curves and the linearity.


²This result implies that the firm that purchases
more of the input pays a higher price. This might seem
to contradict the more intuitive notion that larger users
tend to pay less than small users. The apparent contradiction
stems from the fact that quantity discounts are
used as a self-selection mechanism when the seller
does not know the demand curves of the buyers. In the
example, the seller does know the demand curve for
each buyer, so quantity discounts are unnecessary.

³In fact, the conditions under which a monopolist’s
optimal discriminatory prices will bracket his optimal
uniform price are not completely known. DeGraba
(1989) provides some general restrictions on the
monopolist’s profit functions for which this is true.
Similar results for the case of independent markets
were developed simultaneously and independently by
Babu Nahata et al. (1990).
OBSERVATION 2: When the supplier
price-discriminates, the low-marginal-cost producer
produces less, and the high-marginal-cost
producer produces more than they would
under uniform pricing. In the short run, therefore,
welfare is lower under price discrimination.


PROOF:
Direct calculation shows that, with price
discrimination, firm $i$ produces


(5a)  $q_i^d = (a - 2c_i + c_{-i})/6b$


while with a uniform price, the low-cost firm
produces


(5b)  $q_i^u = [a - (7/2)c_i + (5/2)c_{-i}]/6b$


where $-i$ indicates firm $i$’s opponent.

It is obvious from inspection that, if $c_i <
c_{-i}$, then (5b) > (5a), which implies that the
low-cost firm produces less when there is
price discrimination. Since total industry
output is unaffected by the choice of the
supplier’s pricing policy (a direct result of
linear demand), the high-cost firm must
produce more. This means that discriminatory
input prices raise the total cost to society
of producing the equilibrium quantity of
output. This is welfare-reducing.
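
The argument can be illustrated numerically. The sketch below evaluates the expressions in (5a) and (5b), using the denominator $6b$ implied by substituting (3a), (3b), and (4) into the Cournot solution, for a hypothetical low-cost firm ($c = 1$) and high-cost firm ($c = 2$) with $a = 10$ and $b = 1$. These values and all function names are assumptions for illustration only.

```python
from fractions import Fraction as Fr

def q_discrimination(a, b, ci, cj):
    """Output of firm i under discriminatory input prices, equation (5a)."""
    return (a - 2 * ci + cj) / (6 * b)

def q_uniform(a, b, ci, cj):
    """Output of firm i under a uniform input price, equation (5b)."""
    return (a - Fr(7, 2) * ci + Fr(5, 2) * cj) / (6 * b)

# Hypothetical parameter values, chosen only for illustration.
a, b, c_low, c_high = Fr(10), Fr(1), Fr(1), Fr(2)

qd_low = q_discrimination(a, b, c_low, c_high)
qd_high = q_discrimination(a, b, c_high, c_low)
qu_low = q_uniform(a, b, c_low, c_high)
qu_high = q_uniform(a, b, c_high, c_low)

total_d = qd_low + qd_high
total_u = qu_low + qu_high

# Real resource cost of production: the input itself is produced at
# zero marginal cost, so only the c_i terms count.
cost_d = c_low * qd_low + c_high * qd_high
cost_u = c_low * qu_low + c_high * qu_high

print(total_d, total_u, cost_d, cost_u)
```

Total output is the same under both regimes, so the welfare comparison in this example rests entirely on the higher resource cost under discrimination.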


The lesson to be learned from Observations
1 and 2 is simple. When a price-discriminating
supplier of an input sells to
producers who have different marginal costs,
there is an incentive for the supplier to
charge a higher price to the lower-cost firm.
As a result, the supplier partially “subsidizes”
the cost differential between the
two firms. When the supplier sets a uniform
price, he does not provide this subsidy. Thus,
price discrimination results in a smaller cost
differential between the two firms, which
causes the lower-cost firm to produce less
and the higher-cost firm to produce more
than under uniform pricing. This fact will
drive the results of the next section.


⁵In general, there will be two welfare effects, one
resulting from reallocating production among
different-cost producers and the other resulting from a
change in total output. It is difficult to make statements
about welfare when these two effects conflict.


II. The Long Run 


In this section, I present a model that 
suggests that the pricing policy of the sup- 
plier can affect decisions made by down- 
stream producers other than the short-run 
choice of quantity or price. Specifically, each 
producer faces the task of choosing his pro- 
duction technology, which will affect the 
firm’s cost structure and, therefore, his abil- 
ity to compete against his rivals. I show that 
when the supplier charges discriminatory 
prices, the producers choose a technology 
with a higher marginal cost than they would 
when the supplier sets a uniform price. 

I extend the model of Section I by allowing
each downstream producer to choose
his level of marginal cost. A lower marginal
cost, $c_i$, can be obtained at the expense of a
higher fixed cost, $F_i$. In order to obtain a
closed-form solution, I use the function $F_i
= \alpha c_i^2 - \beta c_i + \gamma$, for $0 < c_i < \beta/2\alpha$. As with
the $c_i$’s, I assume that the $F_i$’s represent the
true cost of the fixed resources used by the
producers.

The following restrictions must now be 
placed on the parameters: 


(R1)  $\alpha, \beta > 0$

(R2)  $(1/9b) - \alpha < 0$

(R3)  $(7/4)(a/9b) - \beta < 0$

(R4)  $\alpha a - (\beta/2) > 0$

(R5)  $\gamma < a^2/36b$.


Inequality (R1) implies that $F_i$ is downward-sloping
and convex for $0 < c_i < \beta/2\alpha$.
Inequality (R2) is the statement that the
second derivative of profit with respect to $c$
is negative. Expressions (R3) and (R4) are
sufficient to guarantee that the first-order
conditions yield interior solutions: (R3)
states that $F_i$ is steeper than the net revenue
function ($r$, defined as sales minus
variable costs) when the supplier is setting a
uniform price and $c_1 = c_2 = 0$; (R4) is the
statement that $F_i$ reaches a minimum at a
value of $c_i < a$ (see Fig. 1). Finally, (R5) is a
sufficient (but by no means necessary) condition
for producers to earn a strictly positive
profit. It says that the fixed costs associated
with a zero marginal cost are less than
the net revenues earned at $c_i = 0$. This is
clearly a much stronger condition than necessary,
but it is simple and is sufficient for
my purposes.


FIGURE 1 (axis marks: $\beta/2\alpha$, $a$)

I can now present a three-stage game. 
The producers choose c; in stage 1. I as- 
sume that this choice is made with the 
knowledge of whether or not the supplier 
will employ a uniform price. This assump- 
tion can be justified on the grounds that 
offering or not offering an MFC has been a 
customary practice in the past. An example 
of this might be a labor union pledging to 
negotiate the same wage contract with all 
manufacturers in a given industry. There 
may also be laws such as the Robinson- 
Patman Act, which effectively prevents a 
supplier from price-discriminating among 
downstream competitors. 

Once the technology decisions have been
made, the game proceeds as described in
Section I. The supplier offers a price to
each firm, and the firms choose quantity
taking their technology choices and input
prices as given.

Formally, the strategy of the supplier is a
function $K: R^{2+} \to R^{2+}$. $K$ maps every possible
ordered pair $(c_1, c_2)$ into input price
choices $(k_1, k_2)$. The strategy of producer $i$
has the form $(c_i, Q_i)$, where the number
$c_i \in [0, \beta/2\alpha]$ is his marginal cost of production,
and the function $Q_i: R^{2+} \to R^{1+}$
maps $(c_1 + k_1, c_2 + k_2)$ into a quantity
choice, $q_i$. The supplier’s payoff is $\pi_S =
\sum_i k_i q_i$, and the payoff to producer $i$ is


(6)   $\pi_i = [a - b(q_1^* + q_2^*) - c_i - k_i] q_i^*
          - [\alpha c_i^2 - \beta c_i + \gamma]$.


The equilibrium concept is that of perfect 
Nash equilibrium. 


PROPOSITION 1: The technology chosen
by producers under price discrimination has a
higher marginal cost than the technology chosen
under uniform pricing. Output in the final
goods market is therefore lower under price
discrimination than under uniform pricing.


PROOF:

As with all multistage games, the appropriate
procedure is to solve the game backwards.
The stage 2 and 3 solutions have
already been calculated in the previous section.
It is only necessary to solve the first
stage of this game.

To calculate the choices of $c_1$ and $c_2$
when the supplier is allowed to price-discriminate,
equations (3a), (3b), and (5a) must
be substituted into (6). Solving the first-order
conditions simultaneously yields


(7a)  $c_1^d = c_2^d = (a - 9b\beta)/(1 - 18b\alpha)$


along with the other equilibrium values:


(7b)  $k_1^d = k_2^d = -9b(2\alpha a - \beta)/2(1 - 18b\alpha)$


(7c)  $q_1^d = q_2^d = -3(2\alpha a - \beta)/2(1 - 18b\alpha)$.
When the supplier is constrained to
charge a uniform price, equations (4) and
(5b) are substituted into (6). Solving these
first-order conditions yields:


(8a)  $c_1^u = c_2^u = [(7/4)a - 9b\beta]/[(7/4) - 18b\alpha]$


(8b)  $k^u = -9b(2\alpha a - \beta)/2[(7/4) - 18b\alpha]$


(8c)  $q_1^u = q_2^u = -3(2\alpha a - \beta)/2[(7/4) - 18b\alpha]$.


From the restrictions placed on the parameters,
we know that $(7/4)a - 9b\beta < 0$,
$(7/4) - 18b\alpha < 0$, and $2\alpha a - \beta > 0$. These
inequalities guarantee that (7a)-(7c) and
(8a)-(8c) are positive. It is then simple to
show that $c^u < c^d$ if and only if $\alpha a - \beta/2
> 0$, and the fact that $q^u > q^d$ is obvious
from inspection.
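
A small numerical sketch may help in checking the closed forms in (7a)-(7c) and (8a)-(8c). The parameter values below ($a = b = \alpha = 1$, $\beta = 1/5$, $\gamma = 1/50$) are assumptions picked only because they satisfy (R1)-(R5); the function names are likewise hypothetical. The stage-1 payoff uses the standard fact that, in a linear Cournot equilibrium, the price-cost margin equals $b q_i$, so that net revenue is $b q_i^2$.

```python
from fractions import Fraction as Fr

def fixed_cost(c, alpha, beta, gamma):
    return alpha * c ** 2 - beta * c + gamma

def restrictions_hold(a, b, alpha, beta, gamma):
    return (alpha > 0 and beta > 0                      # (R1)
            and 1 / (9 * b) - alpha < 0                 # (R2)
            and Fr(7, 4) * a / (9 * b) - beta < 0       # (R3)
            and alpha * a - beta / 2 > 0                # (R4)
            and gamma < a ** 2 / (36 * b))              # (R5)

def equilibrium(a, b, alpha, beta, discriminate):
    """Symmetric (c, k, q) from (7a)-(7c) or (8a)-(8c)."""
    w = Fr(1) if discriminate else Fr(7, 4)
    den = w - 18 * b * alpha
    c = (w * a - 9 * b * beta) / den
    k = -9 * b * (2 * alpha * a - beta) / (2 * den)
    q = -3 * (2 * alpha * a - beta) / (2 * den)
    return c, k, q

def producer_profit(ci, cj, a, b, alpha, beta, gamma, discriminate):
    """Stage-1 payoff b*q_i**2 - F(c_i), with q_i from (5a) or (5b)."""
    if discriminate:
        q = (a - 2 * ci + cj) / (6 * b)
    else:
        q = (a - Fr(7, 2) * ci + Fr(5, 2) * cj) / (6 * b)
    return b * q ** 2 - fixed_cost(ci, alpha, beta, gamma)

# Hypothetical parameter values satisfying (R1)-(R5).
a, b, alpha, beta, gamma = Fr(1), Fr(1), Fr(1), Fr(1, 5), Fr(1, 50)
ok = restrictions_hold(a, b, alpha, beta, gamma)

c_d, k_d, q_d = equilibrium(a, b, alpha, beta, discriminate=True)
c_u, k_u, q_u = equilibrium(a, b, alpha, beta, discriminate=False)

def is_best_response(c_star, discriminate, eps=Fr(1, 1000)):
    """True if small unilateral changes in c_i lower the producer's payoff."""
    base = producer_profit(c_star, c_star, a, b, alpha, beta, gamma, discriminate)
    return all(
        producer_profit(c_star + d, c_star, a, b, alpha, beta, gamma, discriminate) < base
        for d in (-eps, eps)
    )

print(ok, c_d, c_u, q_d, q_u)
```

In this example the supplier's revenue, $2kq$, is also lower under discrimination, in line with the remark on the supplier's long-run profit that follows the proof.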


This model suggests that, when the supplier
price-discriminates, the producers will
choose a technology with a higher marginal
cost than they will when the supplier charges
a uniform price. The reason is simple. It
was seen earlier (Observation 1) that if there
is a difference in marginal costs, the discriminating
supplier will charge the producer
with a low marginal cost a higher
input price, partially offsetting the cost differential.

In this case, the price-discriminating supplier
reduces the incentives to reduce
marginal cost unilaterally, because the advantage
gained through achieving a lower
marginal cost is partially offset by the differential
in the supplier’s input prices. When
the supplier charges a uniform price, each
firm receives the full benefit of a cost reduction
(since none of the advantage is dissipated
through price discrimination). Thus,
unilateral cost reductions that are marginally
profitable under uniform pricing are unprofitable
under price discrimination. This
results in a higher equilibrium marginal cost
under price discrimination. The higher
marginal cost then translates into a lower
level of output in the final goods market.
While the results are shown for the simple
linear case above, the intuition can be
generalized as follows. Consider the game
in which each competitor chooses a level of
$c_i$. If the best-response functions of this
game are either downward-sloping or upward-sloping
with slope less than 1 and the
monopolist finds it optimal to charge the
low-cost firm more than he charges the
high-cost firm, then price discrimination results
in a higher choice of $c$ and lower
output than does the use of uniform pricing.⁶

It should be noted that, unlike the results 
in the short run, the supplier’s profit in the 
long run is lower under price discrimination 
than it would be under uniform pricing. The 
reason is that discriminatory prices induce 
the firms to choose a higher marginal cost, 
which causes the supplier to charge a lower 
price. Since less output is produced, the 
supplier sells less of the input at the lower 
price, resulting in a lower profit. 


PROPOSITION 2: Welfare (as measured by 
the sum of consumer and producer surplus) is 
lower under price discrimination than under 
uniform pricing. 


PROOF: 
I prove this proposition using two lem- 
mas. 


LEMMA 1: Each producer’s average cost
(in terms of real resources used) of producing
$q^u$ under uniform pricing ($AC^u$) is less than
his average cost of producing $q^d$ under price
discrimination ($AC^d$).


PROOF: 
See the Appendix. 


LEMMA 2: In equilibrium, under uniform
pricing, the price in the final goods market is
greater than the producer’s average cost of
producing $q^u$.


⁶See DeGraba (1988) for the proof. If best-response
functions have slopes between −1 and 1, then the
equilibrium will be stable, as in Avinash Dixit (1986).
The result can be extended further to games with
best-response functions with slope less than −1, but
the interpretation of such an equilibrium is troublesome.


PROOF: 

This follows immediately from the fact 
that each firm earns a positive profit, which 
is guaranteed by (R5). 


Lemmas 1 and 2 can be used to generate
the graph in Figure 2 for the final goods
market equilibria. This graph clearly shows
that welfare under price discrimination is
lower than welfare under uniform pricing.

The proof suggests that the use of price 
discrimination creates two effects which de- 
crease social welfare. The first is the fact 
that output is lower under price discrimina- 
tion. This is not surprising. Decreasing out- 
put virtually always decreases welfare in 
Cournot models. 

The second is that price discrimination
changes the marginal-cost/fixed-cost “mix.”
Under price discrimination, producers
choose a level of c that is too high (and 
correspondingly a level of F that is too low) 
from a welfare point of view. The use of 
uniform pricing causes producers to choose 
a lower c and a higher F, which in this 
situation is more efficient. 

To illustrate, consider a producer who
must choose $c_1$ to minimize the following
cost function:


(9)   $TC = \alpha c_1^2 - \beta c_1 + \gamma + c_1 q_1 + k_1 q_1$


where all parameters are positive, $k_1$ is not
a function of $c_1$, and $q_1$ is fixed. For simplicity,
let the input for which $k_1$ is the
price have a zero cost of production. The
first-order conditions tell us that the cost-minimizing
$c_1$ is given by


(10)  $c_1^* = (\beta - q_1)/2\alpha$.


Notice that this choice of $c_1^*$ also minimizes
society’s cost of producing $q_1$.

Now let $k_1$ be a (decreasing) function of
$c_1$. This will alter the first-order conditions
from (9) so that the firm’s optimal choice of
$c_1$ no longer minimizes society’s cost of production.
Substituting (3a) into (9) and (4)
into (9) and optimizing, one obtains


(11a) $c_1^d = [\beta - (1/2)q_1]/2\alpha$


(11b) $c_1^u = [\beta - (3/4)q_1]/2\alpha$


respectively. Notice that $c_1^d$ is farther from
$c_1^*$ than is $c_1^u$. This is because the use of
discriminatory prices “augments” the effect
of $c_1$ on $k_1$ vis-a-vis uniform pricing, which
increases the distortion caused by this functional
relationship. Thus, price discrimination
causes firm 1 to choose $c_1$ farther from
$c_1^*$, thereby increasing the cost of production
and reducing welfare.
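
The ranking of the three technology choices can be checked with a few lines of arithmetic. The sketch below evaluates (10), (11a), and (11b) and the corresponding real resource cost (fixed cost plus $c_1 q_1$, since the input is assumed to cost nothing to produce). The values $\alpha = 1$, $\beta = 1/5$, $\gamma = 1/50$, and $q_1 = 1/10$ are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as Fr

def c_social(alpha, beta, q1):
    """Cost-minimizing marginal cost when k_1 does not depend on c_1, eq. (10)."""
    return (beta - q1) / (2 * alpha)

def c_discrimination(alpha, beta, q1):
    """Privately optimal c_1 when k_1 = (a - c_1)/2, eq. (11a)."""
    return (beta - q1 / 2) / (2 * alpha)

def c_uniform(alpha, beta, q1):
    """Privately optimal c_1 when k = (2a - c_1 - c_2)/4, eq. (11b)."""
    return (beta - Fr(3, 4) * q1) / (2 * alpha)

def resource_cost(c1, alpha, beta, gamma, q1):
    """Society's cost of producing q_1: fixed cost plus c_1 * q_1."""
    return alpha * c1 ** 2 - beta * c1 + gamma + c1 * q1

# Hypothetical parameter values, chosen only for illustration.
alpha, beta, gamma, q1 = Fr(1), Fr(1, 5), Fr(1, 50), Fr(1, 10)

c_star = c_social(alpha, beta, q1)
c_d = c_discrimination(alpha, beta, q1)
c_u = c_uniform(alpha, beta, q1)

cost_star = resource_cost(c_star, alpha, beta, gamma, q1)
cost_u = resource_cost(c_u, alpha, beta, gamma, q1)
cost_d = resource_cost(c_d, alpha, beta, gamma, q1)

print(c_star, c_u, c_d)
print(cost_star, cost_u, cost_d)
```

Because the cost function is quadratic in $c_1$, the excess cost over the minimum equals $\alpha$ times the squared distance from $c_1^*$, which is why the choice farther from $c_1^*$ is the costlier one.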

The message from this analysis is clear. 
When a supplier finds it in his interest to 
charge low-cost producers a high price and 
high-cost producers a low price, price dis- 
crimination creates incentives that discour- 
age the reduction of marginal cost at the 
expense of fixed cost. This results in an 
industry output that is lower than would
occur under uniform pricing. This decrease 
in output, along with the use of a less effi- 
cient marginal cost/fixed-cost mix, pro- 
duces welfare that is lower than would oc- 
cur under uniform pricing. 
APPENDIX


PROOF OF LEMMA 1: Consider a firm
that has chosen the technology with
marginal cost $c^d$ and a firm which has chosen
the technology with marginal cost $c^u$.
Suppose they each produce $q^u$ units. A fair
amount of algebra is required to show that
the total cost for the firm with technology $c^u$
is less than the total cost for the firm with
technology $c^d$. Subtracting the total cost of
production of the firm with marginal cost $c^u$
from the total cost of the firm with marginal
cost $c^d$ yields


(A1)  $\Delta TC = \alpha (c^d)^2 - \beta c^d + c^d q^u
            - [\alpha (c^u)^2 - \beta c^u + c^u q^u]$.

Plugging (7a), (7c), (8a), and (8c) into (A1)
and simplifying yields


(A2)  $\Delta TC = \dfrac{(27b/4)(2\alpha a - \beta)\,[(1/4)(9b\alpha + 1)(2\alpha a - \beta)]}{[(1 - 18b\alpha)((7/4) - 18b\alpha)]^2}$.


This is clearly positive. Since both firms are
producing the same output level, the firm
with marginal cost $c^d$ also has a higher
average cost of production.

Now let the firm with marginal cost $c^d$
reduce its output from $q^u$ to $q^d$. Since
$(F + cq)/q$ is a decreasing function of $q$
for $F, c > 0$, decreasing output increases
average cost. Therefore, a firm producing
$q^d$ with technology $c^d$ has a higher average cost
than a firm producing $q^u$ with technology
$c^u$.
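
The steps of this proof can be traced numerically. The sketch below computes the cost difference in (A1) directly from the equilibrium values in (7a), (7c), (8a), and (8c), and then compares average costs at the two equilibrium outputs. The parameter values are hypothetical ones satisfying (R1)-(R5); the closed form used as a cross-check is this sketch's own simplification of (A1) and should be read as an assumption rather than as the expression printed in the original.

```python
from fractions import Fraction as Fr

# Hypothetical parameter values satisfying (R1)-(R5).
a, b, alpha, beta, gamma = Fr(1), Fr(1), Fr(1), Fr(1, 5), Fr(1, 50)

d1 = 1 - 18 * b * alpha            # denominator in (7a)-(7c)
d2 = Fr(7, 4) - 18 * b * alpha     # denominator in (8a)-(8c)
m = 2 * alpha * a - beta

c_d = (a - 9 * b * beta) / d1                  # (7a)
q_d = -3 * m / (2 * d1)                        # (7c)
c_u = (Fr(7, 4) * a - 9 * b * beta) / d2       # (8a)
q_u = -3 * m / (2 * d2)                        # (8c)

def total_cost(c, q):
    """Real resource cost: fixed cost F(c) plus variable cost c * q."""
    return alpha * c ** 2 - beta * c + gamma + c * q

# (A1): both technologies evaluated at the same output q_u.
delta_tc = total_cost(c_d, q_u) - total_cost(c_u, q_u)

# Cross-check against a closed form derived for this sketch (an assumption).
closed_form = (Fr(27, 4) * b * m) * (Fr(1, 4) * (9 * b * alpha + 1) * m) / (d1 * d2) ** 2

# Average costs at the respective equilibrium outputs (Lemma 1).
ac_u = total_cost(c_u, q_u) / q_u
ac_d = total_cost(c_d, q_d) / q_d

print(delta_tc, ac_u, ac_d)
```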
Direction of Price Changes in Third-Degree
Price Discrimination


By BABU NAHATA, KRZYSZTOF OSTASZEWSKI, AND P. K. SAHOO*


There is a long tradition in economic 
literature of comparing welfare under dis- 
criminatory pricing with nondiscrimina- 
tory pricing employing the usual Marshall- 
ian measure. For example, Richard 
Schmalensee (1981), by assuming indepen- 
dent demand and constant marginal cost, 
clearly shows that if welfare were to in- 
crease under discrimination, then the total 
output with discrimination must be higher 
than the total output without discrimina- 
tion. Hal Varian (1985) analyzes the welfare 
question in a more general setting. He de- 
velops bounds that serve as a sufficient con- 
dition for welfare to increase. In addition, 
these bounds provide important insights in 
predicting the welfare change under differ- 
ent demand and cost functions. Marius 
Schwartz (1990) proves that, for any cost 
function, if discrimination decreases output,
it will decrease welfare, as well.

There are two reasons why the welfare 
question has been so difficult to answer. 
1) Based on his analysis Schmalensee (1981 
p. 245) concludes that “If all demand func- 
tions are strictly concave or convex and if 
the p,’s [prices in each submarket after dis- 
crimination] are not nearly equal, there is 
apparently no simple, general way to tell if 
monopolistic discrimination will raise or 
lower total output.”¹ Given this absence of


*The first author is in the Department of Economics 
and Finance, and the last two authors in the Depart- 
ment of Mathematics, University of Louisville, 
Louisville, KY 40292. Comments of Stephen Layson 
and Marius Schwartz were very helpful in revising the 
paper. The geometrical explanation for our results 
using Figures 1 and 2 is due to Stephen Layson, for 
which we are thankful. We also thank other anony- 
mous referees. We acknowledge the research assis- 
tance of Zhao-Bi Ha. Our research was partially 
supported by the University of Louisville. An earlier 
version of this paper was presented at the Fourth 
Congress of the European Economic Association in 
Augsburg, West Germany, in September 1989. 
a simple test, it is not possible to verify
whether the necessary condition (i.e., in- 
crease in output under discrimination) is 
satisfied or not. 2) Even if one were able to 
develop a test that could predict when the 
total output would increase under discrimi- 
nation, it may not be very useful in answer- 
ing the basic question of welfare change, 
because an increase in total output is only a 
necessary condition and hence does not 
guarantee an increase in welfare. Thus, to 
answer the welfare question, a new ap- 
proach is needed for theoretical analysis of 
third-degree price discrimination. 

Instead of concentrating on the output 
effects of discrimination, this paper focuses 
on the price effects of discrimination. A 
basic result that has remained unquestioned 
in the literature is that when there are two 
classes of buyers, discrimination raises price 
for one class and lowers it for the other. 
However, in an interesting paper focusing 
on price discrimination in intermediate good 
markets, Michael Katz (1987 p. 156) con- 
cludes that “Under reasonable conditions, 
intermediate good price discrimination leads 
to higher input price being charged to all 
buyers, a result that never would arise in a 
corresponding model of a final good mar- 
ket.” According to Katz this surprising re- 
sult is a striking example of the difference 
between intermediate and final good mar- 
kets. We not only show that discrimination 
in the final good market can also raise prices 
for all buyers (the result denied by Katz), 


¹Joan Robinson (1933 pp. 192-5) proposed a test
based on the adjusted concavities of the submarkets’
demand functions at the nondiscriminating monopoly
price to determine whether total output rises or falls
after discrimination. However, Melvin Greenhut and
Hiroshi Ohta (1976), with the help of an example, have
clearly demonstrated that Robinson’s proposed test is
not valid under all conditions and, hence, seems to
have little real value.
but further show that it can also lower the
prices for all buyers. It is important to emphasize
that when prices move in the same
direction in both markets, the welfare effect
of discrimination may be quite large compared
to the more typical situation when
price rises in one market and falls in the
other. When both prices move in the same
direction, the welfare effects are predictable.
When both prices go down, consumer’s
surplus increases because of the
price movement, while profit increases due
to discrimination; thus, welfare must go up.
When prices go up, the total output is reduced,
which causes welfare to decrease.


I. Main Results


We first give two examples to demonstrate
that price discrimination can either
increase prices in both markets or decrease
prices in both markets. We do not restrict
the demand functions to be strictly concave
or convex. We only require that they are
continuous and twice differentiable with
negative slope throughout their range. We
also do not restrict the profit functions to be
strictly concave or restrict marginal revenue
curves to be declining continuously (e.g.,
Schmalensee, 1981 p. 243). John Formby,
Stephen Layson, and James Smith (1982)
clearly demonstrate that the assumption of
continuously declining marginal revenue
may be quite restrictive, and “...demand
conditions leading to upward sloping
marginal revenue may indeed be pervasive”
(Formby et al., 1982 p. 306). Based on their
examples they conclude that, “...very simple
analytical demand curves may have
non-trivial upward sloping marginal revenue
curves and that multiple profit equilibria for
firms cannot be easily dismissed” (p. 309).
We use polynomial demand functions to
illustrate these results for the following reasons.
First, a polynomial demand function
relaxes the assumption of strict concavity or
convexity, and at the same time, it also
relaxes the assumptions of declining
marginal revenue and concavity of profit
functions. Second, any sufficiently smooth
demand function can be approximated with
a polynomial using Taylor’s expansion with
any desired degree of accuracy. Thus, a
polynomial demand function, being more
general, can provide an analytical framework
for identifying other types of demand
functions where similar results may hold.
For the sake of illustration, we consider
only two markets and constant marginal cost.
Both markets are served with and without
discrimination.


FIGURE 1. OPTIMAL PRICES WITH CONCAVE
PROFIT FUNCTIONS

Figures 1 and 2 provide geometrical explanation
for our results. If $\pi_1(p)$ and $\pi_2(p)$
are profit functions for submarkets 1 and 2,
respectively, and $\pi(p) = \pi_1(p) + \pi_2(p)$ is
the uniform-price profit function, then at
the single uniform profit-maximizing price,
$p^*$, $\pi'(p) = \pi_1'(p) + \pi_2'(p) = 0$. Note in Figure 1
that, at $p^*$, $\pi_1'(p) < 0$ and $\pi_2'(p) > 0$.
Therefore, if $\pi_1$ and $\pi_2$ both are concave,
$p_1^* < p^* < p_2^*$, the usual case. If, however,
$\pi_1$ has two local maxima, then it is possible
that $p_1^* < p^*$ and $p_2^* < p^*$ or $p_1^* > p^*$ and
$p_2^* > p^*$. The former case is illustrated in
Figure 2 (for expository purposes, the shapes
of the graphs of three profit functions are
more exaggerated for Example 1).


FIGURE 2. OPTIMAL PRICES WITH UNRESTRICTED
PROFIT FUNCTIONS


Example 1: Discrimination lowers prices in
both markets.


Market 1 demand:
$Q_1 = -0.25 p_1^3 + 2.0001 p_1^2 - 5.5 p_1 + 10$
Market 2 demand:
$Q_2 = -0.2561 p_2^3 + 2.7 p_2^2 - 9.5 p_2 + 12$
Combined demand:
$Q = -0.5061 p^3 + 4.7001 p^2 - 15 p + 22$
Marginal cost:
$c = 0.1$
Profit $\pi(p)$ without discrimination:
$\pi(p) = -0.5061 p^4 + 4.7001 p^3 - 15 p^2 + 22 p
        - 0.1\{-0.5061 p^3 + 4.7001 p^2 - 15 p + 22\}$
$      = -0.5061 p^4 + 4.75071 p^3 - 15.47001 p^2 + 23.5 p - 2.2$


Profit maximization yields²


$p^* = 3.859496$,  $Q^* = 5.0232297$,
$\pi^* = 18.884816$,  $Q_1 = 4.1931976$,
$Q_2 = 0.8300321$.


Profit $\tilde{\pi}(p_1, p_2)$ with discrimination:


$\tilde{\pi}(p_1, p_2)
  = -0.25 p_1^4 + 2.0001 p_1^3 - 5.5 p_1^2 + 10 p_1
    - 0.2561 p_2^4 + 2.7 p_2^3 - 9.5 p_2^2 + 12 p_2
    - 0.1\{-0.25 p_1^3 + 2.0001 p_1^2 - 5.5 p_1
    + 10 - 0.2561 p_2^3 + 2.7 p_2^2 - 9.5 p_2 + 12\}$


Profit maximization yields


$p_1^* = 3.809910$      $p_2^* = 1.097442$
$Q_1^* = 4.2521695$     $Q_2^* = 4.4876276$
$\pi_1^* = 15.775169$   $\pi_2^* = 4.476147$

$p^* > p_1^* > p_2^*$ and $\pi^* < \tilde{\pi}^* = \pi_1^* + \pi_2^*$.


Note that the output in market 2 increases
substantially due to discrimination.


²All these values represent global optima. Detailed
numerical computations for the two examples are available
from the authors upon request.
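
The figures reported for Example 1 can be approximately reproduced with a simple search. The sketch below is not the authors' computation (which is available only on request); it is a minimal illustration that scans a price grid and then refines the best grid point by golden-section search. The search interval $[0.1, 5]$, the grid size, and all function names are assumptions.

```python
C = 0.1                               # constant marginal cost
Q1 = (-0.25, 2.0001, -5.5, 10.0)      # market 1 demand, highest power first
Q2 = (-0.2561, 2.7, -9.5, 12.0)       # market 2 demand, highest power first

def poly(coeffs, p):
    """Evaluate a polynomial by Horner's rule (highest-degree coefficient first)."""
    value = 0.0
    for coef in coeffs:
        value = value * p + coef
    return value

def profit(demand, p):
    return (p - C) * poly(demand, p)

def argmax(f, lo=0.1, hi=5.0, steps=4900, tol=1e-9):
    """Grid scan for the global peak, then golden-section refinement."""
    h = (hi - lo) / steps
    best = max(range(steps + 1), key=lambda i: f(lo + i * h))
    left = max(lo, lo + (best - 1) * h)
    right = min(hi, lo + (best + 1) * h)
    g = (5 ** 0.5 - 1) / 2
    while right - left > tol:
        x1 = right - g * (right - left)
        x2 = left + g * (right - left)
        if f(x1) < f(x2):
            left = x1
        else:
            right = x2
    return (left + right) / 2

# Uniform pricing: one price for both markets.
p_uniform = argmax(lambda p: profit(Q1, p) + profit(Q2, p))
pi_uniform = profit(Q1, p_uniform) + profit(Q2, p_uniform)

# Discrimination: each market's profit is maximized separately.
p1 = argmax(lambda p: profit(Q1, p))
p2 = argmax(lambda p: profit(Q2, p))
pi_discrim = profit(Q1, p1) + profit(Q2, p2)

print(round(p_uniform, 4), round(p1, 4), round(p2, 4))
print(round(pi_uniform, 4), round(pi_discrim, 4))
```

A grid scan is used before the local refinement because the market 2 profit function has more than one local maximum, so a purely local method could stop at the wrong peak.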


Example 2: Discrimination raises prices in
both markets.

Market 1 demand:
$Q_1 = -0.25 p_1^3 + 2.0001 p_1^2 - 5.5 p_1 + 6$
Market 2 demand:
$Q_2 = -0.25 p_2^3 + 2.655 p_2^2 - 9.5 p_2 + 12.5$
Combined demand:
$Q = -0.5 p^3 + 4.6551 p^2 - 15 p + 18.5$
Marginal cost:
$c = 0.1$
Profit $\pi(p)$ without discrimination:


$\pi(p) = -0.5 p^4 + 4.6551 p^3 - 15 p^2 + 18.5 p
        - 0.1\{-0.5 p^3 + 4.6551 p^2 - 15 p + 18.5\}$


Profit maximization yields


$p^* = 1.158687$,  $Q^* = 6.5916246$,
$\pi^* = 6.978467$,  $Q_1 = 1.9235665$,
$Q_2 = 4.6680581$.


Profit $\tilde{\pi}(p_1, p_2)$ with discrimination:


$\tilde{\pi}(p_1, p_2)
  = -0.25 p_1^4 + 2.0001 p_1^3 - 5.5 p_1^2 + 6 p_1
    - 0.25 p_2^4 + 2.655 p_2^3 - 9.5 p_2^2 + 12.5 p_2
    - 0.1\{-0.25 p_1^3 + 2.0001 p_1^2 - 5.5 p_1
    + 6 - 0.25 p_2^3 + 2.655 p_2^2
    - 9.5 p_2 + 12.5\}$


Profit maximization yields


$p_1^* = 3.013776$      $p_2^* = 1.170637$
$Q_1^* = 0.7474162$     $Q_2^* = 4.616279$
$\pi_1^* = 2.177805$    $\pi_2^* = 4.942359$


$p^* < p_2^* < p_1^*$ and $\pi^* < \tilde{\pi}^* = \pi_1^* + \pi_2^*$.


Note that the output in market 1 decreases
significantly due to discrimination.
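
Example 2 also illustrates why the profit functions cannot be assumed single-peaked. The sketch below, again only an illustration under assumed settings (search interval $[0.1, 5]$, grid size, and function names are all hypothetical), locates every local maximum of each profit polynomial by looking for sign changes of its derivative and refining them by bisection; the global optimum is then the local maximum with the highest profit.

```python
C = 0.1
Q1 = (-0.25, 2.0001, -5.5, 6.0)       # market 1 demand, highest power first
Q2 = (-0.25, 2.655, -9.5, 12.5)       # market 2 demand, highest power first

def poly(coeffs, p):
    """Evaluate a polynomial by Horner's rule (highest-degree coefficient first)."""
    value = 0.0
    for coef in coeffs:
        value = value * p + coef
    return value

def profit_poly(demand, c):
    """Coefficients of (p - c) * Q(p), highest degree first."""
    times_p = list(demand) + [0.0]
    times_c = [0.0] + [c * q for q in demand]
    return [hi - lo for hi, lo in zip(times_p, times_c)]

def derivative(coeffs):
    n = len(coeffs) - 1
    return [coef * (n - i) for i, coef in enumerate(coeffs[:-1])]

def local_maxima(profit_coeffs, lo=0.1, hi=5.0, steps=4900):
    """Prices where the derivative switches from positive to non-positive."""
    d = derivative(profit_coeffs)
    h = (hi - lo) / steps
    peaks = []
    for i in range(steps):
        left, right = lo + i * h, lo + (i + 1) * h
        if poly(d, left) > 0 >= poly(d, right):
            for _ in range(60):                 # bisection on the derivative
                mid = (left + right) / 2
                if poly(d, mid) > 0:
                    left = mid
                else:
                    right = mid
            peaks.append((left + right) / 2)
    return peaks

pi1, pi2 = profit_poly(Q1, C), profit_poly(Q2, C)
pi_uniform = [x + y for x, y in zip(pi1, pi2)]

peaks_1, peaks_2, peaks_u = local_maxima(pi1), local_maxima(pi2), local_maxima(pi_uniform)

p1 = max(peaks_1, key=lambda p: poly(pi1, p))
p2 = max(peaks_2, key=lambda p: poly(pi2, p))
p_uniform = max(peaks_u, key=lambda p: poly(pi_uniform, p))

print([round(p, 4) for p in peaks_1], [round(p, 4) for p in peaks_2],
      [round(p, 4) for p in peaks_u])
print(round(p_uniform, 4), round(p2, 4), round(p1, 4))
```

Each of the three profit functions has two local maxima on this interval, which is the feature that allows both discriminatory prices to lie on the same side of the uniform price.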


We now provide sufficient conditions under
which the single nondiscriminatory price
will always be lower than the maximum of
the prices in the submarkets but higher than
the minimum of the prices in the submarkets.
These results are stated as a theorem
and its corollaries.

Consider a monopolist selling a product
in a single market. Let $Q = Q(p)$ be the
demand function, where $p$ is the price. Let
$\pi$ be the profit function, and let $C(Q)$ be
the total cost function. Then,


$\pi(p) = pQ(p) - C(Q)$.


Now suppose the monopolist segments the
market into $n$ distinguishable submarkets,
where $n \ge 2$ is a positive integer. Let $Q_i(p_i)$
be the demand function in the $i$th market, $i =
1, 2, \ldots, n$. If the monopolist discriminates,
the total profit is


$\tilde{\pi}(p_1, p_2, \ldots, p_n) = \sum_{i=1}^{n} \pi_i(p_i)
   = \sum_{i=1}^{n} p_i Q_i - C\left(\sum_{i=1}^{n} Q_i\right)$


where $\pi_i$ is the profit in the $i$th submarket.
Let $(p_1^*, p_2^*, \ldots, p_n^*)$ be the price vector
maximizing the total profit $\tilde{\pi}(p_1, p_2, \ldots, p_n)$.

If no discrimination occurs, the profit
equals $\pi(p) = \tilde{\pi}(p, p, \ldots, p)$.


THEOREM 1: If for each $i = 1, 2, \ldots, n$,
$\pi_i(p_i)$ is, for $p_i > 0$, a continuous function
with a global maximum at $p_i^*$ such that, for
$p_i < p_i^*$, $\pi_i(p_i)$ is strictly increasing and,
for $p_i > p_i^*$, $\pi_i(p_i)$ is strictly decreasing (so
that the profit function is single-peaked),
then $(p_1^*, p_2^*, \ldots, p_n^*)$ maximizes
$\tilde{\pi}(p_1, p_2, \ldots, p_n)$. If $p^*$ maximizes $\pi(p) =
\tilde{\pi}(p, p, \ldots, p)$, then


$\min\{p_1^*, p_2^*, \ldots, p_n^*\}
   \le p^* \le \max\{p_1^*, p_2^*, \ldots, p_n^*\}$.


PROOF:

For $p < \min(p_1^*, p_2^*, \ldots, p_n^*)$ consider
$\tilde{\pi}(p, p, \ldots, p) = \pi(p) = \sum_i \pi_i(p)$. For $p_i <
p_i^*$, each $\pi_i(p_i)$ is an increasing function, so
that for $p < \min(p_1^*, p_2^*, \ldots, p_n^*)$, $\pi(p)$ is
increasing. Similarly, for $p > \max(p_1^*,
p_2^*, \ldots, p_n^*)$, $\pi(p)$ is decreasing. Also, $\pi(p)$
is a continuous function on the compact
interval $[\min(p_1^*, p_2^*, \ldots, p_n^*), \max(p_1^*,
p_2^*, \ldots, p_n^*)]$; thus, it attains a maximum
there. The maximum is global since the
values of $\pi(p)$ for $p$ not in
$[\min(p_1^*, p_2^*, \ldots, p_n^*), \max(p_1^*, p_2^*, \ldots, p_n^*)]$
are smaller than for $p$ in the interval. Since
$p^* \in [\min(p_1^*, p_2^*, \ldots, p_n^*), \max(p_1^*,
p_2^*, \ldots, p_n^*)]$, this proves the theorem. Note
that we make no assumption about the demand
and cost functions, but only about the
profit function.


Corollaries 1 and 2 are immediate. 


COROLLARY 1: If each profit function $\pi_i$
is concave, then the conclusion of Theorem 1
holds.


COROLLARY 2: If each $Q_i$, for $i =
1, 2, \ldots, n$, is concave (this includes the case
of linear demand functions) and $c$ is constant,
then the conclusion of Theorem 1 holds (for
$p_i > 0$).
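
The linear-demand case of Corollary 2 can be illustrated with a short calculation. In the sketch below each market has a hypothetical demand $Q_i = A_i - B_i p$ and marginal cost is constant; the three markets and the cost level are assumptions made up for the illustration. With linear demand the profit $(p - c)(A - Bp)$ is a concave quadratic whose peak is at $(A/B + c)/2$, and the same formula applies to the pooled market as long as every segment is served at the uniform price.

```python
from fractions import Fraction as Fr

def peak_price(A, B, c):
    """Maximizer of (p - c) * (A - B * p), a concave quadratic in p."""
    return (A / B + c) / 2

# Hypothetical linear demands Q_i = A_i - B_i * p and marginal cost c.
markets = [(Fr(10), Fr(1)), (Fr(12), Fr(2)), (Fr(9), Fr(1))]
c = Fr(1)

segment_peaks = [peak_price(A, B, c) for A, B in markets]

# Under uniform pricing the demands are pooled (valid if all markets are served).
A_total = sum(A for A, _ in markets)
B_total = sum(B for _, B in markets)
p_uniform = peak_price(A_total, B_total, c)
all_served = all(A - B * p_uniform > 0 for A, B in markets)

print(segment_peaks, p_uniform, all_served)
```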


COROLLARY 3: If the demand functions 
for each market are of the constant-elasticity 
type and marginal cost is constant, then the 
conclusion of Theorem 1 holds. 


PROOF: 
Consider the $i$th market for $i = 1, 2, \dots, n$.
We have $Q_i(p) = p^{-\eta_i}$, where $\eta_i > 1$ is a
constant. Then

$$\pi_i(p_i) = (p_i - c)\,p_i^{-\eta_i}.$$

If $c = 0$, this function does not have a maximum, and in fact
$\lim_{p_i \to 0^+} \pi_i(p_i) = +\infty$, so this
case is insignificant. If $c > 0$, the function
has a unique maximum at $p_i^* = c[\eta_i/(\eta_i - 1)]$, is increasing for $p_i < p_i^*$ and decreasing
for $p_i > p_i^*$. Thus, the assumptions of Theorem 1 are met.
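
The closed-form peak in Corollary 3 and the bound in Theorem 1 can be checked numerically. The sketch below is only an illustration: the marginal cost `c = 1.0`, the three elasticities, the price grid, and the helper names are assumptions chosen for the example, not values from the paper.

```python
def profit_i(p, eta, c):
    """Profit in one constant-elasticity market: (p - c) * p**(-eta)."""
    return (p - c) * p ** (-eta)


def argmax_on_grid(f, lo, hi, steps):
    """Return the grid point in [lo, hi] at which f is largest."""
    best_p, best_v = lo, f(lo)
    for k in range(1, steps + 1):
        p = lo + (hi - lo) * k / steps
        v = f(p)
        if v > best_v:
            best_p, best_v = p, v
    return best_p


c = 1.0                  # assumed constant marginal cost
etas = [1.5, 2.0, 4.0]   # assumed elasticities, each greater than 1

# Closed-form discriminatory prices p_i* = c * eta_i / (eta_i - 1).
exact_peaks = [c * eta / (eta - 1.0) for eta in etas]

# Grid search for each market's profit-maximizing price.
grid_peaks = [
    argmax_on_grid(lambda p, e=eta: profit_i(p, e, c), 1.0, 6.0, 50000)
    for eta in etas
]

# Uniform price: maximize the sum of market profits at a common price.
p_uniform = argmax_on_grid(
    lambda p: sum(profit_i(p, eta, c) for eta in etas), 1.0, 6.0, 50000
)

print("closed form:", exact_peaks)
print("grid search:", grid_peaks)
print("uniform price:", p_uniform)
```

Under these assumed parameters the uniform price found on the grid should fall between the smallest and the largest discriminatory price, as Theorem 1 requires.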


Note that the profit functions $\pi_i$ in the
constant-elasticity case are not concave;
their concavities change at

$$p_i = c\,\frac{\eta_i + 1}{\eta_i - 1}.$$
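
The point at which concavity changes can be recovered by differentiating the constant-elasticity profit function twice. The derivation below is a sketch that assumes $c > 0$ and $\eta_i > 1$, as in the proof of Corollary 3.

```latex
\begin{align*}
\pi_i(p)   &= (p - c)\,p^{-\eta_i}, \\
\pi_i'(p)  &= p^{-\eta_i - 1}\bigl[(1 - \eta_i)\,p + \eta_i c\bigr], \\
\pi_i''(p) &= \eta_i\,p^{-\eta_i - 2}\bigl[(\eta_i - 1)\,p - (\eta_i + 1)\,c\bigr].
\end{align*}
```

Hence $\pi_i''(p) < 0$ for $p < c(\eta_i + 1)/(\eta_i - 1)$ and $\pi_i''(p) > 0$ above that price. Because $p_i^* = c\,\eta_i/(\eta_i - 1)$ is the smaller of the two, the peak lies in the concave region, and the function is single-peaked without being globally concave.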


II. Concluding Remarks 


In a recent paper, as a corollary to their 
proposition 2, Jerry Hausman and Jeffrey 
MacKie-Mason (1988) conclude that, “... if 
marginal cost is constant, then with more 
than one market served under uniform pric- 
ing, at least one discriminatory price must 
be higher than the uniform price, so that a 
Pareto improvement is not possible” (p. 256 
and fn. 9, p. 257). Our paper shows that this 
conclusion in general is not true. If demand 
functions are not restricted to a particular 
class, the examples presented above clearly 
demonstrate that third-degree price dis- 
crimination may either lower or raise price 
in all submarkets. Also, the traditional re- 
sult that discrimination increases price in 
some markets and lowers it in the other 
markets can be obtained. 

We emphasize that discrimination results
in a Pareto improvement only when both 
discriminatory prices are lower than the 
uniform price. Hausman and MacKie- 
Mason (1988) show that, when both markets 
are served under uniform price, scale 
economies are necessary for Pareto im- 
provement. They rule out the possibility of 
Pareto improvement in the absence of scale 
economies if both markets are to be served
with a uniform price. We show that this
need not be the case. Pareto improvement
can occur even when marginal cost is constant.
Thus, Hausman and MacKie-Mason’s
conclusion that discrimination can result in
a Pareto improvement could be generalized
to include the case of constant marginal cost as
well.

Although in this paper we have shown 
that both prices can move in the same direc- 
tion as a result of discrimination, the math- 
ematical and economic conditions under 
which it is true remain unexplored. 

Third-Degree Price Discrimination and Output:
Generalizing a Welfare Result


By MARIUS SCHWARTZ*


One of the best-known conjectures in the 
economics of price discrimination is that a 
move by a monopolist from uniform pricing 
to third-degree price discrimination—charg- 
ing different prices in different exogenously 
identifiable markets—reduces the sum of 
consumer surplus and profit (hereinafter 
“welfare”) if total output decreases. This 
conjecture can be found, at least implicitly, 
as far back as A. C. Pigou (1920). It is of 
some interest, since it suggests a welfare 
test that only requires knowledge of observ- 
able magnitudes. Richard Schmalensee 
(1981) proves the conjecture assuming that 
the monopolist can perfectly separate mar- 
kets and that marginal cost is constant. Hal 
Varian (1985) extends the result by allowing 
imperfect arbitrage, so that demand in any 
market can depend on prices in other mar- 
kets, and by allowing marginal cost to be 
constant or increasing. (Schmalensee and 
Varian establish additional useful results on 
the welfare effects of third-degree price dis- 
crimination.) Using a revealed-preference 
argument, this note generalizes the result to 
the case in which marginal cost is decreas- 
ing, a serious possibility in the context of 
monopoly. 

In order to motivate the revealed-prefer- 
ence approach, it is helpful to review the 
intuition for the result when marginal cost 
is constant or increasing and show why that 
intuition can break down when marginal 
cost is decreasing. Suppose that the
monopoly output under uniform pricing is
$q^u$ and that moving to discrimination yields
a total output $q^d$ below $q^u$. Welfare under


*Georgetown University and Antitrust Division, U.S.
Department of Justice. The views expressed here do 
not necessarily represent those of the U.S. Department 
of Justice. For helpful discussions and comments, I 
thank Tim Brennan, Maxim Engers, Martin Richard- 
son, Marilyn Simon, Bert Smiley, and Jean Tirale. 
discrimination will be no higher than if the
same output $q^d$ is allocated through uniform
pricing: uniform pricing allocates a
given total output optimally (it leaves no
unexploited gains from reshuffling output
between markets), while discriminatory pricing
in general will induce misallocations by
distorting consumers’ choices. Also, welfare
achieved if $q^d$ is allocated through uniform
pricing will be lower than if the higher output
$q^u$ is allocated through uniform pricing.
This follows because $q^u$ is the monopolist’s
choice under uniform pricing, so the demand
curve lies above the marginal cost
curve at $q^u$. If marginal cost is nondecreasing,
demand will lie above marginal cost
also at lower outputs; hence, reducing output
below $q^u$ will reduce welfare.

If marginal cost is decreasing, this type of
argument is inconclusive. At some outputs
below $q^u$ the demand curve might now lie
below the marginal cost curve, as illustrated
in Figure 1. Thus, welfare under uniform
pricing, $W^u(q)$, can increase over some
range as output falls below $q^u$. I therefore
proceed along a different tack, using a
revealed-preference argument that relies on
$q^u$ being a profit-maximizing output under
uniform pricing.

Consider a monopolist selling to $n$ exogenously
identifiable markets. Let $p_i$ and $q_i$
respectively denote the price and output
sold in market $i$, $i = 1, \dots, n$. The
monopolist’s total cost function is $C(\sum_i q_i)$;
that is, total cost depends only on total
output and not on its distribution among
markets. The markets can be viewed, for 
example, as different types of customers 
(e.g., students, senior citizens), different 
times of purchase (e.g., lunch vs. dinner), or 
different locations to which the monopolist 
ships its output. (In the last case, cost can
be independent of the output’s distribution
among markets if, for example, markets are
equidistant to the monopolist’s plant and
transport cost is constant.) Following Varian
(1985), I allow imperfect arbitrage
among markets (with perfect arbitrage, of
course, price discrimination would be impossible).
That is, if price differentials are
sufficiently high, then goods or customers
might move between locations, nonstudents
might obtain fake student ID’s, and dinner
patrons might switch to lunch.

FIGURE 1. WELFARE AND OUTPUT UNDER UNIFORM PRICING:
EXAMPLE IN WHICH WELFARE FUNCTION IS NOT SINGLE-PEAKED

It is not necessary to get into details of 
the arbitrage technology. One simply thinks 
of the n markets as representing different 
goods to consumers and allows each individ- 
ual’s indirect utility function to depend on 
the prices of all n goods. In order to use the 
classical welfare measure of total consumer 
surplus plus profit, each individual’s indirect 
utility function is assumed to be quasi-linear 
in the vector of n prices and in all other 
goods, which are treated as a composite 
commodity y and used as the numeraire. 
Under the quasi-linear preferences, one can 
also aggregate across consumers and think 
of the indirect utility function of a repre- 
sentative individual whose endowment in 
the numeraire is $y_0$: $f(p_1, \dots, p_n, y_0) = v(p_1, \dots, p_n) + y_0$.
The function $v$ embodies
whatever substitutability exists among the n 
goods or, equivalently, whatever arbitrage is 
possible among the n markets. (For discus- 
sions of consumer surplus, aggregation, 
quasi-linear utility, and the composite com- 
modity theorem see Angus Deaton and John 
Muellbauer [1980] or Varian [1984].) 

If the monopolist’s n goods are sold un- 
der uniform pricing (p; = p for all i), then 
one can simplify further and think also of 
these n goods as a composite commodity 
whose price is p, and write the indirect 
utility function as 


$$F(p, y_0) = V(p) + y_0.$$


Note that $V(p)$ gives consumer surplus from
purchasing the monopolist’s composite good
at price $p$ (if one normalizes $V$ by setting
$V(p) \to 0$ as $p \to \infty$). $V(p)$ is always strictly
decreasing and weakly convex. Since $F$ is
linear in $y_0$, the negative of the derivative
of $V$, where it exists, gives the demand
function for the composite good: $q = D(p) = -V'(p)$.
The only substantive assumption
is that $V(p)$ is strictly convex, that is, that
the demand for the monopolist’s composite
good is a strictly decreasing function of
price.

Let $W^u(q)$ denote welfare when the
monopolist maximizes profit subject to being
constrained to charge uniform prices
and to sell a given total quantity $q$:


(1) $W^u(q) = V(h(q)) + \Pi(q)$


where $h(q)$ is the inverse demand function
and $\Pi(q) = h(q)q - C(q)$ is profit and
where, for simplicity, we omit from welfare
the endowment term $y_0$, which is constant.


Observe that $V$ is strictly increasing in $q$,
since it is strictly decreasing in $p$ and since
the inverse demand function is strictly decreasing.
That is, given a downward-sloping
demand curve, consumer surplus is higher if
a higher output is sold. Whether pricing is
uniform or not, welfare (again ignoring $y_0$)
can also be expressed as utility minus cost:


(2) $W(q_1, \dots, q_n) = U(q_1, \dots, q_n) - C(\sum_i q_i)$.

It is now possible to establish the welfare
result.


PROPOSITION 1: Suppose that $p^u$ and $q^u = D(p^u)$
are a profit-maximizing price and
output pair when the monopolist is constrained
to charge uniform prices. Consider
any discriminatory price vector
$\mathbf{p}^d = (p_1, \dots, p_n)$, $p_i \ne p_j$ for at least some $i \ne j$,
which yields an associated output vector
$\mathbf{q}^d = (q_1, \dots, q_n)$, and denote the total output by
$q^d = \sum_i q_i$. If total output is lower under discrimination,
then welfare also is lower. That
is, if $q^d < q^u$, then $W(\mathbf{q}^d) < W^u(q^u)$.


PROOF: 

I show that $W(\mathbf{q}^d) \le W^u(q^d) < W^u(q^u)$.
Consider the first inequality. For any total
output $q$, let $W^*(q)$ denote the solution to
the planner’s problem: $\max U(q_1, \dots, q_n) - C(\sum_i q_i)$
subject to $\sum_i q_i = q$. Since cost is
fixed, the planner’s problem is equivalent to
$\max U(q_1, \dots, q_n)$ subject to $\sum_i q_i = q$. Now
consider $W^u(q)$. Since $q$ is the quantity of
the monopolist’s composite good, $q = D(p) = \sum_i q_i(p)$,
where the outputs $[q_1(p), \dots, q_n(p)]$
maximize utility given $p_i = p$. This
means that $D(p)$ solves $\max U(q_1, \dots, q_n)$
subject to $p \sum_i q_i = pq$, which coincides with
the planner’s problem. Thus, $W^u(q) = W^*(q)$.
Since $W^*(q)$ is the maximum feasible
welfare given the constraint $\sum_i q_i = q$, the
first inequality is established.

Consider the second, more novel inequality.
Given $q^d < q^u$, it is known that
$V(h(q^d)) < V(h(q^u))$. Since $q^u$ is a
profit-maximizing output (not necessarily unique)
under uniform pricing, $\Pi(q^d) \le \Pi(q^u)$.
Thus, by expression (1), $q^d < q^u$ implies
$W^u(q^d) < W^u(q^u)$.


Intuitively, the first inequality reflects the
fact that, if the cost function depends only
on total output and not on its distribution
among goods or markets, then the constraint
$\sum_i q_i = q$ can be interpreted as a particular
transformation function, one with
marginal transformation rates of unity. Uniform
pricing reflects these marginal rates of
transformation. Thus, a uniform-price equilibrium
will maximize welfare for the given
level of total output, while discriminatory
prices generally will not. This is just the
same logic that underlies the first welfare
theorem.
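
The first inequality can be made concrete by comparing first-order conditions. The sketch below assumes that $U$ is the differentiable direct utility behind the quasi-linear preferences and that solutions are interior; the multiplier $\lambda$ is introduced here for illustration only.

```latex
\begin{align*}
\text{Planner:}\quad
  & \max_{q_1,\dots,q_n} U(q_1,\dots,q_n)
    \quad \text{s.t.} \quad \textstyle\sum_i q_i = q
  && \Longrightarrow \quad \frac{\partial U}{\partial q_i} = \lambda
     \quad \text{for all } i, \\
\text{Consumer:}\quad
  & \max_{q_1,\dots,q_n,\,y} U(q_1,\dots,q_n) + y
    \quad \text{s.t.} \quad \textstyle\sum_i p_i q_i + y = y_0
  && \Longrightarrow \quad \frac{\partial U}{\partial q_i} = p_i
     \quad \text{for all } i.
\end{align*}
```

Under uniform pricing, $p_i = p$ for all $i$, so the consumer's conditions coincide with the planner's with $\lambda = p$. When $p_i \ne p_j$ for some pair of markets, marginal utilities differ across markets, and reallocating a unit of output toward the higher-priced market would raise $U$ without changing total output or cost.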

The second inequality is where the revealed-preference
argument comes in. It
shows that, regardless of the shape of the
cost function, welfare under uniform pricing
is higher at a profit-maximizing output
$q^u$ than at any lower output $q^d$. For more
intuition, express welfare under uniform
pricing as total valuation minus total cost:
$W^u(q) = B(q) - C(q)$, where $B$ is the integral
under the demand curve from 0 to $q$.
Since $q^u$ maximizes profit, moving from a
lower output $q^d$ to $q^u$ must increase revenue
by at least as much as cost: $\Delta R \ge \Delta C$.
Since increasing quantity demanded from
$q^d$ to $q^u$ would require lowering price, total
valuation would increase by more than revenue:
$\Delta B > p^u(q^u - q^d) > \Delta R$. Therefore,
$\Delta W = \Delta B - \Delta C > \Delta R - \Delta C \ge 0$, so welfare
must increase if, under uniform pricing,
output is raised to a profit-maximizing level.
Correspondingly, Figure 1 shows welfare at
$q^u$ to be higher than at any lower output.
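
The revealed-preference inequality can be illustrated numerically with a case of decreasing marginal cost. The sketch below is only an example: the linear inverse demand $h(q) = 10 - q$, the cost function with marginal cost $2 + 12e^{-q}$, and the output grid are assumptions chosen so that welfare under uniform pricing is not monotone below the profit-maximizing output.

```python
import math


def h(q):
    """Assumed inverse demand: price at which quantity q is demanded."""
    return 10.0 - q


def cost(q):
    """Assumed total cost; marginal cost 2 + 12*exp(-q) is decreasing in q."""
    return 2.0 * q + 12.0 * (1.0 - math.exp(-q))


def profit(q):
    """Uniform-pricing profit: revenue minus cost."""
    return h(q) * q - cost(q)


def welfare(q):
    """Total valuation B(q) = 10q - q^2/2 (area under demand) minus cost."""
    return 10.0 * q - 0.5 * q * q - cost(q)


grid = [k / 1000 for k in range(8001)]   # outputs from 0 to 8
q_u = max(grid, key=profit)              # profit-maximizing output on the grid
lower = [q for q in grid if q < q_u]

# Welfare at q_u exceeds welfare at every lower output on the grid.
holds = all(welfare(q) < welfare(q_u) for q in lower)

# Yet welfare is not monotone below q_u: it falls over some range.
non_monotone = any(
    welfare(lower[k + 1]) < welfare(lower[k]) for k in range(len(lower) - 1)
)

print(q_u, holds, non_monotone)
```

In this assumed example, marginal cost exceeds the demand price at very low outputs, so welfare initially falls as output rises, which is the kind of shape Figure 1 describes. The comparison with the profit-maximizing output still goes the way the proposition says.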

Note that if marginal cost is decreasing 
and the comparison is of two arbitrary out- 
puts, both below the efficient level, then one 
cannot be sure that welfare will be higher at 
the higher output. When the cost function is 
concave, welfare—value minus cost—need 
not be concave everywhere (even though 
value is concave) and therefore need not be 
single-peaked. It is because the higher out- 
put represents a profit maximum that one 
can be sure that welfare there is higher. 

I conclude with two remarks about the 
policy relevance of the analysis. First, the 
welfare result rests on the assumption that 
demand curves faced by the monopolist 
generate adequate measures of welfare. This 
condition can fail, for example, when the 
monopolist is selling to distorted intermedi- 
ate-good markets rather than to final con- 
sumers. Consider an input monopolist sell- 
ing at a uniform price to several unrelated 
intermediate-good industries. Suppose that 
in equilibrium the proportional price-cost 
markups are different in the various indus- 
tries due to different degrees of competition 
(rather than different demand elasticities). 
Then, allocating a given quantity of the input
through uniform pricing does not maximize
welfare for that input quantity; lower
input prices should be charged to the industries
with the higher markups. If price discrimination
by the input monopolist results
in such a pattern, then welfare can be higher
under discrimination even if the total quantity
of the input is lower. (Such desirable
discrimination might be profit-maximizing
for the monopolist if, for instance, those
industries with the higher markups also have
greater ability to substitute in production
away from the monopolist’s input.) That is,
price discrimination by the input monopolist
could help counteract the downstream distortions.
This is a standard second-best ambiguity.

The second remark concerns the informa- 
tion needed for my result and for those of 
Schmalensee (1981) and Varian (1985) to 
provide useful welfare tests in practice (as- 
suming that areas under demand curves do 
accurately reflect welfare). What must the 
policymaker know in order to infer that
welfare is lower under discrimination if output
is observed to be lower? My proposition
requires the policymaker to be confident
that the monopolist knows demand and
cost and that the output observed under
uniform pricing, $q^u$, is profit-maximizing.

DECEMBER 1990 


Schmalensee (1981) and Varian (1985) require
only that marginal cost at $q^u$ be less
than price ($q^u$ need not be profit-maximizing,
because of the monopolist’s imperfect
knowledge about cost and demand), provided
the policymaker knows also that
marginal cost is nondecreasing at lower outputs.
Thus, my result requires more information
of the monopolist but less of the policymaker:
the policymaker must know only that the
monopolist possesses the requisite information
needed to maximize profit under uniform
pricing.

The Measurement of International Trade Related to
Multinational Companies


By F. STEB HIPPLE*


Multinational companies (MNCs) are involved
in a significant share of U.S. international
trade. According to the Bureau of
Economic Analysis, U.S.-based multinational
companies were associated with 80
percent of export trade and 40 percent of
import trade in 198s (Obie G. Whichard,
1988, p. 87). These aggregate trade shares,
however, do not necessarily measure the
level of U.S. trade that is uniquely related
to MNC operations. This paper will present
and analyze four alternate concepts of the
definition of MNC-related trade.

Multinational companies evolve from domestic
firms that expand their international
activities beyond importing and exporting.
These firms make equity investments abroad
and acquire foreign subsidiaries and affiliates.
The emergence of multinational companies
has facilitated the evolution of the
United States to a more “open” economy
where international trade is a major and
growing element in production and consumption.

The main source of data and analysis on
the activities of multinational companies is
the U.S. Bureau of Economic Analysis
(BEA), part of the Department of Commerce.
The bureau defines the MNC as
being a firm based in one country (the
“parent”) with at least a 10 percent equity
interest in a firm located in a second country
(the “affiliate”). Information on these
firms is provided through periodic benchmark
surveys of direct foreign investment
and annual reports on MNC operations
(Whichard, p. 86). The amount and quality
of BEA data on multinational firms have
increased as MNCs have become more significant
in U.S. international trade and investment.
The set of historical data, however,
presents several major problems in
addition to the definition of what constitutes
MNC-related trade.

First, the detailed data on affiliate opera- 
tions are not comparable. The data on the 
foreign affiliates of U.S.-based multination- 
als focus on the majority-owned affiliates 
(50 percent equity interest or more). The 
data on the U.S. affiliates of foreign-based 
MNCs focus on all affiliates (10 percent 
equity interest or more). Second, data are 
available on the parent firms of U.S. multi- 
national companies; no data are available 
on the parent units of foreign multination- 
als. 

Third, the very important benchmark sur- 
veys have been conducted at irregular inter- 
vals, and U.S. and foreign MNCs have never 
been surveyed during the same year. The 
final problem concerns the timing of the 
data. The MNCs covered in the benchmark 
surveys and annual samples have always re- 
ported on a fiscal-year basis. This means 
that comparisons with calendar-year data 
are never exact, although such comparisons 
are routinely performed (Whichard, 1988, 
p. 87). 

This paper will examine four different 
definitions of the trade role of MNCs, in- 
cluding the BEA definition. For comparabil- 
ity, the paper will develop trade data for 
foreign multinational firms and for U.S. and 
foreign firms combined. The analytical pro- 
cedure used in this paper combines a trade 
matrix and flow-of-funds approach to struc- 
ture the trading activities of multinational 
companies. The trade data for U.S. and 
foreign multinational companies are based 
on information in the benchmark surveys of 
foreign direct investment. 

I. Multinational Companies and Trade Flows


The trade flows that link the United States 
and other countries are composed of many 
individual transactions. These transactions 
involve buyers and sellers, domestic compa- 
nies and foreign companies, and the domes- 
tic and foreign units of multinational com- 
panies. A trade matrix can effectively show 
the interlinkages among these different types 
of trade transactors (Hipple, 1982, 1989). 

Following the approach of the Bureau of 
Economic Analysis, divide the world into 
the United States (U.S.) and the rest of the 
world (ROW). Assume three types of trade 
transactors in the world: parent companies
p, affiliated foreign companies a, and other 
companies o. The parent firms in the United 
States control the affiliates in the rest of the 
world. The ROW parent firms control the 
affiliates in the United States. The “other” 
group of companies are domestic firms that 
engage in international trade. 

The “domestic” affiliates of MNCs in the 
U.S./ROW world will be divided between 
the parent and other groups. The majority- 
owned, domestic affiliates will be counted 
by the BEA as part of the parent group, 
while minority-owned, domestic affiliates (10 
percent equity interest or more) will fall 
into the other group of companies. For ex- 
ample, a majority-owned French affiliate of 
a German-based MNC will be counted as 
part of the ROW parent; a minority-owned 
Dutch affiliate of the same MNC will be 
counted as part of the ROW other group. 

Three sets of trade transactors are lo- 
cated in the United States: (1) the parent 
firms of U.S.-based MNCs, (2) the affiliates 
of foreign-based MNCs, and (3) other U.S. 
companies. Three sets of trade transactors 
are located in the rest of the world: (1) the 
parent firms of foreign-based MNCs, (2) the 
affiliates of U.S.-based MNCs, and (3) other 
ROW companies. 

Total U.S. exports X may be shown as 


(1) X = Xp + Xa+ Xo. 


The terms Xp, Xa, and Xo represent exports
sold by U.S. parent firms, the affiliates
of foreign-based MNCs, and other U.S.
companies, respectively.

The destination of U.S. exports can be 
shown as 


(2a) Xp = Xpp + Xpa+ Xpo 


(2b) Xa = Xap + Xaa + Xao 


(2c) Xo = Xop + Xoa + Xoo.


The first lowercase letter represents the
U.S.-located seller, and the second lowercase
letter represents the ROW-located
buyer. For example, the term Xpo is the
value of exports sold by U.S. parent companies
to the other group of foreign (ROW)
companies.

Similarly, total U.S. imports M can be
shown as


(3) M = Mp + Ma + Mo. 


The terms Mp, Ma, and Mo represent imports
purchased by U.S. parent firms, the
affiliates of foreign-based MNCs, and other
U.S. companies, respectively.

The source of U.S. imports can be shown 
as 


(4a) Mp = Mpp + Mpa + Mpo 


(4b) Ma=Map+ Maa + Mao 


(4c) Mo = Mop + Moa + Moo.


The first lowercase letter represents the
U.S.-located buyer, and the second lowercase
letter represents the ROW-located
seller. For example, the term Mpo is the
value of imports bought by U.S. parent
companies from the “other” group of foreign
(ROW) companies.

Certain categories of trade transactors
have a special interest. The pa and ap
categories represent intrafirm transactions
by American-based and foreign-based
MNCs, respectively. All other transactions
represent “arms-length” relationships. The
pp, aa, and oo categories represent overlaps
(all “arms-length”) among the three
types of trade transactors.

II. Four Definitions of MNC-Related Trade


Overall, there are nine categories of
paired buyers and sellers in the matrix
equation systems in (2) and (4). The four
definitions of trade related to multinational
companies may be shown as different sets of
the nine paired categories.

1. Parent and Affiliate. Under this definition,
MNC-related trade includes all trade
transactions where the parent firm or the
affiliate participates as a buyer and/or
seller. In terms of the matrix equations, it
would involve all the eight categories with p
and/or a. Under this nearly all-inclusive
definition, only the oo category of trade
transactions is excluded. This definition of
MNC-related trade is used by the Bureau of
Economic Analysis to calculate the level of
merchandise trade associated with U.S.
multinational companies (Whichard, p. 87).

2. Parent Alone. All trade transactions
in which the parent firm participates as a
buyer or seller are considered as MNC-related
trade. In terms of the matrix equations,
it would include the five categories
with p. This definition has never received
any official use by the Bureau of Economic
Analysis, but the data have been published
for U.S. multinational companies in the last
benchmark survey of U.S. direct investment
abroad (U.S. Bureau of Economic Analysis
[BEA], 1985, pp. 152, 155).

3. Affiliate Alone. This definition focuses
on the trading activities of the affiliated
firms. All trade transactions in which
the affiliate company participates as a buyer
or seller are counted as MNC-related trade.
In terms of the matrix equations, it would
include the five categories with a. The act
of investing in another country creates the
multinational company, and the trade activities
of the affiliate distinguish the MNC
from other companies involved in trade
transactions (Hipple, 1982, 1989). A major
advantage of this definition is that trade
data are available for the affiliates of both
U.S. and ROW multinational firms.

4. Intrafirm Shipments. Under this definition,
only transactions between the parent
and the affiliate are considered as
MNC-related trade. In terms of the matrix
equations, it would include only the two
categories with pa and ap. This is the most
narrow definition of MNC-related trade, and
it focuses on the distinction between
“arms-length” and “intrafirm” transactions.
The transactions between a parent and affiliate
are not set by market forces or valued
at market prices but represent the production
and distribution operations of the vertically
integrated multinational company. The
values of the transactions are transfer prices
set for internal MNC accounting purposes
(Hipple, 1982, 1989). As in the previous
definition, trade data are available for both
U.S. and ROW companies.
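
The four definitions can be written as sets of the nine paired categories. The sketch below is an illustration only; the names `TYPES`, `CELLS`, and `DEFINITIONS` are hypothetical, and each two-letter label follows the convention of equations (2) and (4), with the U.S.-located party first and the ROW-located party second.

```python
from itertools import product

TYPES = ("p", "a", "o")   # parent, affiliate, other

# The nine paired categories: pp, pa, po, ap, aa, ao, op, oa, oo.
CELLS = {first + second for first, second in product(TYPES, TYPES)}

DEFINITIONS = {
    1: {c for c in CELLS if "p" in c or "a" in c},   # parent and affiliate
    2: {c for c in CELLS if "p" in c},               # parent alone
    3: {c for c in CELLS if "a" in c},               # affiliate alone
    4: {"pa", "ap"},                                 # intrafirm shipments
}

for number, cells in DEFINITIONS.items():
    print(number, len(cells), sorted(cells))
```

Written this way, the relations among the definitions are easy to see: definition 1 is the union of definitions 2 and 3, definition 4 is their intersection, and the only category outside definition 1 is oo.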


III. U.S. Merchandise Trade Flows in 1982


Table 1 is a flow-of-funds table based on 
the matrix equation system and shows U.S. 
merchandise trade data for 1982. Trade 
transactors located in the United States 
form the row headings; trade transactors 
located in the rest of the world form the 
column headings. The trade flows are dis- 
played in “cells” containing U.S. exports, 
imports, and the resulting trade balance. 
The cell labels identify the paired categories 
of trade transactors in equations (2) and (4), 
and the row and column totals. 

The trade data are based on the bench- 
mark surveys of foreign direct investment 
conducted by the U.S. Bureau of Economic 
Analysis. The benchmark surveys collect, 
among other information, detailed data on 
the exports and imports of U.S. and foreign 
multinational companies. As described 
above, more data are available on the oper- 
ations of U.S. multinationals. 

The trade flows shown in Table 1 are 
taken from two different benchmark sur- 
veys. A survey of foreign direct investment 
in the United States was conducted for 1980, 
and a survey of U.S. direct investment 
abroad was conducted for 1982 (BEA, 1983, 
1985). These are sometimes referred to as
“reverse investment” and “outward investment,”
respectively. In the tables, the 1980
data from the reverse investment survey
have been rescaled to 1982 trade levels. In
this manner, the trading activities of both
U.S. and foreign MNCs can be compared
and measured.


TABLE 1. MNC-RELATED TRADE FLOWS IN 1982: U.S. EXPORTS AND IMPORTS
(BILLIONS OF DOLLARS)

                               Firms Located in the Rest of the World
Firms Located              ROW Parent   U.S. Affiliate   Other ROW     Total
in the United States       Firms        Firms            Firms         Trade

U.S. Parent Firms          (pp)         (pa)             (po)          (pT)
  Exports                  $70.803      $46.559          $35.863       $153.225
  Imports                   65.615       41.598            3.748        110.961
  Balance                    5.189        4.961           32.114         42.264
ROW Affiliate Firms        (ap)         (aa)             (ao)          (aT)
  Exports                  $20.186       $2.097          $27.932        $50.215
  Imports                   47.621        1.849           27.318         76.788
  Balance                  -27.435        0.249            0.613        -26.573
Other U.S. Firms           (op)         (oa)             (oo)          (oT)
  Exports                   $0.459       $8.062           $0.232         $8.753
  Imports                   45.637        7.959            2.607         56.203
  Balance                  -45.178        0.102           -2.375        -47.450
Total Trade                (Tp)         (Ta)             (To)          (TT)
  Exports                  $91.448      $56.718          $64.027       $212.193
  Imports                  158.873       51.406           33.673        243.952
  Balance                  -67.424        5.312           30.353        -31.759


Note: The terms in parentheses identify cells in the trade matrix on the basis of
equations (2) and (4). A T is a total of the row or column. The figures in bold type are
taken from the BEA benchmark surveys of direct foreign investment or other official
sources. All other figures are derived under various assumptions. See the Appendix
for the sources and derivation of each cell.


In Table 1, six cells are in bold type, and
ten cells are in regular type. The data in
bold type are taken from the benchmark
surveys and other official sources. The data
in regular type are derived under various
assumptions. The Appendix describes the
derivation of the trade data shown in each
cell.

The trading activities of multinational
affiliates are shown by a row and a column
of cells forming a “cross” in the flow-of-funds
matrix (Hipple, 1989). The row of
cells ap, aa, ao, and aT shows the trade
activities of the U.S.-located affiliates of foreign-based
MNCs. Similarly, the column of
cells pa, aa, oa, and Ta shows the trade
activities of the ROW-located affiliates of
U.S.-based MNCs. Data cell aa shows the
overlap in the trade activities of the two
groups of companies, which must be taken
into account to avoid double counting.

The affiliate “cross” and cell TT contain
five of the six cells in bold type since the
benchmark surveys focus primarily on affiliate
operations. The affiliate data cells in
regular type (aa, ao, and oa) are a simple
allocation of residuals based upon a constant
market share assumption.

The additional cell in bold type (cell pT)
provides the total imports and exports associated
with U.S. parent firms. The imports
and exports with affiliates are known (cell
pa), but there is no rule to allocate the
residual into the two corner cells (pp and
po). A constant market share assumption
was used to derive the missing cells in the
affiliate “cross” since data were available on
the market shares of both U.S. and ROW
affiliate companies. However, no equivalent
information exists on the market shares of
parent companies. Data exist for the U.S.
parents but not for the ROW parents.


TABLE 2. U.S. TRADE ORIGINATED BY MULTINATIONAL COMPANIES IN 1982
(VALUES IN BILLIONS OF DOLLARS; SHARES IN PERCENT)

                                    Definition of MNC-Related Trade
                         DEF 1            DEF 2            DEF 3            DEF 4
                         Parent and       Parent           Affiliate        Intrafirm
                         Affiliate        Alone            Alone            Shipments

United States MNCs
  Export Value           $163.384         $153.225         $56.718          $46.559
  Import Value            120.769          110.961          51.406           41.598
  Balance                  42.615           42.264           5.312            4.961
  Export Share           77.0 percent     72.2 percent     26.7 percent     21.9 percent
  Import Share           49.5             45.5             21.1             17.1
  Overall Share          62.3             57.9             23.7             19.3
Rest of World MNCs
  Export Value           $121.477         $91.448          $50.215          $20.186
  Import Value            188.040          158.873          76.788           47.621
  Balance                 -66.562          -67.424         -26.573          -27.435
  Export Share           57.2 percent     43.1 percent     23.7 percent     9.5 percent
  Import Share           77.1             65.1             31.5             19.5
  Overall Share          67.9             54.9             27.8             14.9
Combined MNCs
  Export Value           $211.961         $173.870         $104.836         $66.745
  Import Value            241.345          204.219          126.345          89.219
  Balance                 -29.384          -30.349          -21.510         -22.474
  Export Share           99.9 percent     81.9 percent     49.4 percent     31.5 percent
  Import Share           98.9             83.7             51.8             36.6
  Overall Share          99.4             82.9             50.7             34.2
Total U.S. Trade
  Export Value           $212.193
  Import Value            243.952
  Balance                 -31.759


Note: The export and import values under each definition are calculated from the trade matrix cells in Table 1. See
the Appendix for the calculation formulas. The export and import trade shares are calculated by dividing the trade
values by total U.S. exports and imports. The overall share is a trade-weighted average of the export and import
shares.


A “heroic” assumption is needed to fill
all the data cells in Table 1. The only data
on market shares for MNC parents refer to
U.S.-based companies. The “heroic” as-
sumption is that ROW parents play the
same role in ROW trade as U.S. parents 
play in U.S. trade. That is, the foreign par- 
ent companies have the same trade share of 
ROW imports and exports as U.S. parents 
have of U.S. imports and exports. Two 
points must be noted. First, the trade shares 
refer to nonaffiliate trade, and second, U.S. 


exports (imports) are the ROW imports (ex- 
ports). With this assumption, the residual 
from cells pT and pa can be allocated into 
cells pp and po. A similar procedure fills 
cells op and oo. Then cells Tp and To can 
be filled by summing the columns. 


IV. Trade Related to Multinational Companies 


Table 2 shows the values and shares of 
MNC-related trade under the four defini- 
tions. Data are shown for U.S.-based MNCs, 
foreign-based MNCs, and the two groups of 
multinational companies combined. The 
derivation formulas are shown in the Ap- 
pendix. Two analytical issues are of interest. 
First, what is the relationship between MNC 
trade and the trade balance? Second, what
is the relationship between MNC trade and 
overall trade levels? 

As seen in Table 2, the United States 
recorded a $31.8 billion deficit in merchan- 
dise trade in 1982. Under definition 1 (the 
“official” BEA concept), the trading activi- 
ties of combined U.S. and ROW multina- 
tional companies were associated with a 
$29.4 billion deficit. This reflects the $42.6 
billion surplus from the activities of U.S. 
multinationals, the $66.6 billion deficit from 
foreign MNCs, and a $5.4 billion deficit 
adjustment to reflect the overlapping trade 
(cells pp and aa). Under definition 2, these 
figures are nearly identical. 

It is an easy step to credit MNC-trading 
activities as the cause of the deficit. How-
ever, as the share data show, combined
MNCs accounted for 99.4 percent of U.S. 
trade under definition 1 and 82.9 percent 
under definition 2. The combined MNC 
trade deficit under definition 1 is simply the 
overall U.S. merchandise trade deficit. The 
similarities of the figures under definition 2 
and the 82.9 percent trade share lead to a 
similar conclusion. If the trade role of MNCs 
is broadly defined, then nearly all merchan- 
dise trade transactions will be counted. 

Under definitions 3 and 4, MNC-related 
trade is restricted to a narrower range of 
merchandise trade transactions (Hipple, 
1989). Combined affiliate trade (definition 
3) resulted in a $21.5 billion deficit in 1982, 
while accounting for 50.7 percent of mer- 
chandise trade. This reflects a modest $5.3 
billion surplus from the activities of U.S. 
affiliates, a $26.6 billion deficit from ROW 
affiliates, and a $0.2 billion deficit adjust- 
ment due to overlapping trade (data cell 
aa). Combined intrafirm shipments (defini- 
tion 4) showed a $22.5 billion deficit and a 
34.2 percent share. The shipments of U.S. 
multinationals had a $5.0 billion surplus that 
was offset by the $27.4 billion deficit of 
foreign MNCs. (There is no trade overlap
under definition 4.) The similarity of the
figures under definitions 3 and 4 is striking.

Intrafirm transactions (definition 4) are 
nested within affiliate transactions (defini- 
tion 3). The figures under definition 3 are 
the result of intrafirm shipments plus some 
additional transactions. These additional
affiliate transactions are “arms-length” and, 
as shown in Table 2, tend to be equal on the 
export and import sides. The affiliates of 
U.S. multinationals have about $10 billion 
in additional exports and imports; the af- 
filiates of ROW multinationals have about 
$30 billion. Thus, the deficit in 1982 from 
affiliate trade (definition 3) appears to be 
the same deficit as from intrafirm trade 
(definition 4). 
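
The nesting argument above can be checked by subtracting the definition 4 values from the definition 3 values in Table 2. The following Python sketch does only that arithmetic; the dictionary layout and function name are illustrative assumptions, and the figures are the Table 2 export and import values in billions of dollars.

```python
# Export and import values (billions of dollars) as reported in Table 2.
# The dictionary layout and names are illustrative, not from the source.
table2 = {
    "US":  {"DEF3": (56.718, 51.406), "DEF4": (46.559, 41.598)},
    "ROW": {"DEF3": (50.215, 76.788), "DEF4": (20.186, 47.621)},
}

def arms_length(group):
    """Affiliate trade (DEF 3) minus intrafirm shipments (DEF 4)."""
    x3, m3 = table2[group]["DEF3"]
    x4, m4 = table2[group]["DEF4"]
    return round(x3 - x4, 3), round(m3 - m4, 3)

for group in ("US", "ROW"):
    exports, imports = arms_length(group)
    # The last column is the balance on the additional arm's-length trade.
    print(group, exports, imports, round(exports - imports, 3))
```

The additional exports and imports are close to each other for both groups, which is why the definition 3 and definition 4 balances nearly coincide.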

These results suggest some findings and 
some questions. First, benchmark survey 
data are needed for the activities of ROW 
parent companies in U.S. merchandise 
trade. The role of multinational companies 
cannot be precisely measured without this 
missing component. The assumption used 
here is “heroic” but may not be too far
from the mark. The combined MNC trade 
shares in the range of 90 percent under 
definition 1 and 80 percent under definition 
2 seem reasonable given the known levels of 
trade associated with U.S.-based MNCs. 

Second, the usefulness of the BEA defi- 
nition of MNC-related trade (definition 1) is 
questionable. There is a near identity be- 
tween MNC-related trade and total mer- 
chandise trade. In this context, separate 
identification of MNC-related trade pro- 
vides little information since the multina- 
tionals account for nearly all of the trade. 
The same comments would apply to defini- 
tion 2. 

Third, definitions 3 and 4 usefully analyze 
MNC-related trade as an important compo- 
nent of total merchandise trade. In 1982, 
affiliate trade was about one-half of U.S. 
merchandise trade; intrafirm trade was 
about one-third. Shifts in the levels and 
shares of these types of transactions can 
provide insights into changes in U.S. trade 
performance (Hipple, 1989). Affiliate trade 
and intrafirm trade provide information on 
the changing structure of U.S. trade trans- 
actors and the motives underlying trade ac- 
tivities. 

APPENDIX 
The data shown in the tables have been developed
from the benchmark surveys of foreign direct invest-
ment conducted periodically by the U.S. Bureau of
Economic Analysis. Benchmark surveys of U.S. direct
investment abroad have been conducted for the years
1966, 1977, and 1982. Benchmark surveys of foreign
direct investment in the United States have been con-
ducted for the years 1974 and 1980. The foreign direct
investment surveys are now on a seven-year cycle. The
next benchmark survey of foreign investment in the
United States will cover 1987, while the next bench-
mark survey of U.S. direct investment abroad will cover
1989. Given the normal data processing lags, the infor-
mation from this last set of benchmark surveys will not
be available until 1992.

Table 1 shows U.S. merchandise trade flows for 
1982 and is based on the benchmark surveys of 1980 
and 1982. The trade data from the 1980 survey of 
foreign direct investment in the United States have 
been rescaled to 1982 levels. The trade figures in bold 
type are taken directly from the benchmark surveys (or 
other government sources), while all other data in 
these tables are derived under various assumptions. 
The derivation of individual data cells in Table 1 is 
discussed below. 

Table 2 shows the trade originated by multinational
companies under the four different definitions. The
trade values and trade shares are based on the trade
flows in Table 1. The calculation formulas for the
export and import values are shown below. The trade
shares are the export and import values divided by
total U.S. exports and imports. The “overall shares”
shown in the table are a trade-weighted average of the
export and import shares.
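
As a check on the share definitions, the sketch below recomputes the shares for two columns of Table 2. It assumes that the trade-weighted average uses total U.S. exports and imports as weights, which reduces to (exports + imports) divided by total trade; the function and constant names are illustrative.

```python
# Total U.S. merchandise trade in 1982 (billions of dollars), from Table 2.
TOTAL_EXPORTS = 212.193
TOTAL_IMPORTS = 243.952

def trade_shares(exports, imports):
    """Return (export share, import share, overall share) in percent."""
    export_share = 100 * exports / TOTAL_EXPORTS
    import_share = 100 * imports / TOTAL_IMPORTS
    # Trade-weighted average of the two shares, with total exports and
    # total imports as weights (an assumption about the weighting).
    overall = 100 * (exports + imports) / (TOTAL_EXPORTS + TOTAL_IMPORTS)
    return round(export_share, 1), round(import_share, 1), round(overall, 1)

us_def1 = trade_shares(163.384, 120.769)       # U.S. MNCs, DEF 1
combined_def4 = trade_shares(66.745, 89.219)   # Combined MNCs, DEF 4
print(us_def1, combined_def4)
```

Both results agree with the corresponding share rows printed in Table 2.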


Derivation of Data Cells in Table 1


(pp) The residual of (pT-pa) must be allocated into 
cells pp and po. It is assumed that the foreign MNC 
parents play the same role in ROW trade as 
U.S. parents play in U.S. trade. Thus, the role of U.S. 
parents in U.S. exports (imports) is replicated by ROW 
parents in ROW exports (imports) that are U.S. im-
ports (exports). In cell pp, exports are ((pTX - paX) ×
pTM/(TTM - aTM)), and imports are ((pTM - paM) ×
pTX/(TTX - aTX)), where pTX is exports in
cell pT, etc.
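
The allocation for cell pp can be sketched in a few lines of Python. This reads the formula as the residual (pT - pa) multiplied by the U.S. parents' share of nonaffiliate trade on the opposite side of the account, which is an interpretation of the text above. All numbers are hypothetical placeholders, not Table 1 data, and the function and argument names are invented for illustration.

```python
def parent_to_parent(pT_flow, pa_flow, pT_opposite, TT_opposite, aT_opposite):
    """Allocate part of the residual (pT - pa) to cell pp.

    pT_flow, pa_flow : U.S. parent totals for the flow being allocated
                       (exports or imports).
    *_opposite       : the opposite flow, used to form the U.S. parents'
                       share of nonaffiliate trade, pT / (TT - aT).
    """
    residual = pT_flow - pa_flow
    parent_share = pT_opposite / (TT_opposite - aT_opposite)
    return residual * parent_share

# Hypothetical values in billions of dollars (NOT the Table 1 figures).
pp_exports = parent_to_parent(150.0, 45.0, 110.0, 240.0, 80.0)
pp_imports = parent_to_parent(110.0, 40.0, 150.0, 210.0, 50.0)

# Cell po is then the remaining residual: pT - (pp + pa).
po_exports = 150.0 - (pp_exports + 45.0)
print(pp_exports, pp_imports, po_exports)
```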

(pa) (BEA, 1985, pp. 129, 133). 

(po) Residual of pT - (pp + pa).

(pT) (BEA, 1985, pp. 152, 155). 

(ap) Adjustment factors are used to convert 1980 
data to a 1982 basis. For exports, the factor is 0.962, 
which is the ratio of 1982 merchandise exports to 1980
exports. For imports, the factor is 1.013, which is the 
ratio of 1982 merchandise imports to 1980 imports 
(U.S. Bureau of the Census, 1980, p. 11, 1982, p. 9; 
BEA, 1983, p. 141). 

(aa) The trade flow data for cells pa, ap, aT, Ta,
and TT are known. The residual amounts (Ta - pa)
and (aT - ap) must be allocated into cells aa and oa
and cells aa and ao, respectively. The unallocated
level of trade within the matrix is TT - (pa + ap).
ROW affiliates account for (Ta - pa); thus, the ratio
(Ta - pa)/(TT - (pa + ap)) is used to allocate (aT -
ap) into cell aa; or, U.S. affiliates account for (aT - ap),
and the ratio (aT - ap)/(TT - (pa + ap)) is used to
allocate (Ta - pa) into cell aa. Either procedure will
result in identical estimates for the trade flows in cell
aa.
(ao) Residual of aT - (ap + aa).

(aT) Adjustment factors are used to convert 1980
data to a 1982 basis. See the discussion under cell ap
(BEA, 1983, p. 141).

(op) The residual of (oT - oa) must be allocated
into cells op and oo. See the discussion under cell pp.

(oa) Residual of Ta - (pa + aa).

(oo) Residual of oT - (op + oa).

(oT) Residual of TT - (pT + aT).

(Tp) Sum of pp + ap + op.

(Ta) (BEA, 1985, pp. 127, 131).

(To) Sum of po + ao + oo.

(TT) (Census, 1982, p. 9). 


Calculation of Trade Values for Table 2 


The export and import trade values shown in Table 
2 are calculated from the trade matrix cells in Table 1. 
DEF 1 is all transactions where a parent or affiliate 
participates as a buyer or seller. DEF 2 is all transac- 
tions where the parent is the buyer or seller. DEF 3 is 
all transactions where the affiliate participates as a 
buyer or seller. DEF 4 is limited to intrafirm transac-
tions between the parent and affiliate.

The cells pp and aa represent an overlap between 
the trading activities of U.S. and foreign MNCs. There- 
fore, the export and import values of combined MNCs 
must be adjusted for the overlap. The cell formulas are 
as follows. 


United States MNCs:
DEF 1 = pp + pa + po + aa + oa
DEF 2 = pp + pa + po
DEF 3 = pa + aa + oa
DEF 4 = pa

Rest of World MNCs:
DEF 1 = pp + ap + op + aa + ao
DEF 2 = pp + ap + op
DEF 3 = ap + aa + ao
DEF 4 = ap

Combined MNCs:
DEF 1 = pp + pa + po + ap + aa + ao + op + oa
DEF 2 = pp + pa + po + ap + op
DEF 3 = pa + aa + oa + ap + ao
DEF 4 = pa + ap.
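
The cell formulas can be illustrated with a small Python sketch. The cell values below are hypothetical placeholders (Table 1 is not reproduced here), and the helper names are invented for illustration. The sketch also verifies the overlap adjustment: the combined value equals the U.S. value plus the ROW value minus the cells counted twice (pp and aa under DEF 1, pp under DEF 2, aa under DEF 3, none under DEF 4).

```python
# Hypothetical trade-matrix cells in billions of dollars (NOT Table 1 data).
cells = {"pp": 5.0, "pa": 40.0, "po": 100.0,
         "ap": 20.0, "aa": 1.0, "ao": 30.0,
         "op": 60.0, "oa": 10.0, "oo": 2.0}

def total(*names):
    """Sum the named cells of the trade matrix."""
    return sum(cells[name] for name in names)

us = {1: total("pp", "pa", "po", "aa", "oa"),
      2: total("pp", "pa", "po"),
      3: total("pa", "aa", "oa"),
      4: total("pa")}
row = {1: total("pp", "ap", "op", "aa", "ao"),
       2: total("pp", "ap", "op"),
       3: total("ap", "aa", "ao"),
       4: total("ap")}
combined = {1: total("pp", "pa", "po", "ap", "aa", "ao", "op", "oa"),
            2: total("pp", "pa", "po", "ap", "op"),
            3: total("pa", "aa", "oa", "ap", "ao"),
            4: total("pa", "ap")}

# Cells that appear in both the U.S. and the ROW formula for each definition.
overlap = {1: total("pp", "aa"), 2: total("pp"), 3: total("aa"), 4: 0.0}

for d in (1, 2, 3, 4):
    print(d, us[d], row[d], overlap[d], combined[d])
```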
The Indirect and Direct Substitution Effects


By Masao OGAKI* 


The classification of two goods as substi- 
tutes or complements by the sign of the 
substitution term defined by John R. Hicks 
(1939) intimately involves the relation of 
each of the two goods to other goods, as
was emphasized by Paul A. Samuelson
(1974) and Masazo Sono (1961) among oth- 
ers. Hence, Hicks’s definition may lead to a 
counterintuitive classification when a third 
good possesses a strong influence. This de-
fect of Hicks’s definition motivated Samuel-
son to propose an alternative definition for
substitutes and complements. The purpose 
of this note is not to propose an alternative 
classification, but to characterize the effect 
on the substitution term from a specified 
third good. Actually, the present note ana- 
lyzes this effect from multiple goods. This 
task is important because Hicks’s definition 
is most frequently used in spite of the defi- 
ciency. 

As discussed in Section I, it is possible to 
give an intuitive argument as to how the 
classifications of two goods are affected by a 
third good. The main goal for my character- 
ization of the effect from a third good given 
in Section II is to provide the intuitive argu- 
ment with precise and quantitative content. 
The effect is characterized by a further de- 
composition of the substitution term: given 
a third good, the substitution term will be 
decomposed into what I call the direct and
the indirect substitution terms, the former
being free from the effect of the third good,


*Department of Economics, University of Rochester, 
Rochester, NY 14627. I thank Lars Hansen, Lionel
McKenzie, Jose Schenkman, participants of the Sec-
ond Buffalo-Cornell-Rochester Conference, and two
grateful to Mahmoud El-Gamal, John Heaton, Dean
Sundram, and especially Sherwin Rosen for sugges-
tions which improved the exposition and to Kenjiro
Hirayama, Noboru Kiyotaki, Robert Lucas, and Kimi-
nori Matsuyama for conversations which motivated this
work. 
and the latter characterizing the effect. This
characterization enables one to verify 
whether the given third good causes two 
goods to be substitutes (complements) when 
the two goods would be complements (sub- 
stitutes) if the effect of the third good were 
eliminated. Section III gives an empirical 
illustration. Previously (Ogaki, 1989), I ap- 
plied the concept of the indirect and direct 
substitution effects to theoretical work in 
international financial economics. 


I. An Intuitive Argument 


Samuelson (1974 p. 1255) offered an illu- 
minating example: 


...sometimes I like tea and cream... I 
also sometimes take cream with my
coffee. Before you agree that cream is 
therefore a complement to both tea 
and coffee, I should mention that I 
take much less cream in my cup of 
coffee than I do in my cup of tea. 
Therefore, a reduction in the price of 
coffee may reduce my demand for 
cream, which is an odd thing to hap- 
pen between so-called complements. 


Though Samuelson treats the uncompen- 
sated price change here, it is obvious that 
this example is also applicable to the com- 
pensated price change. Thus, coffee and 
cream may be classified as substitutes rather 
than complements in Hicks’s definition. 
This example can be explained as follows. 
Suppose that a compensated price reduc- 
tion in the price of coffee is experienced by 
a consumer. Then there exist two kinds of 
effects which work in opposite directions on 
the demand for cream. One kind of effect 
works directly: since the consumer tends to 
consume coffee and cream together, the de-
mand for cream is increased. The other
kind of effect works indirectly via the de- 
mand for tea: he now demands less tea, 
since coffee and tea are substitutes, and less 
consumption of tea leads to less demand for
cream. We may call the former the direct 
substitution effect and the latter the indirect 
substitution effect. Though coffee and cream 
are direct complements, coffee and cream 
are indirect substitutes with respect to tea, 
as the argument above shows. If the indirect 
substitution effect is greater than the direct 
substitution effect in absolute value, coffee 
and cream are substitutes in Hicks’s sense. 

In the example, coffee and cream are 
indirect substitutes, because cream is a 
complement of a substitute of coffee, namely 
tea. A similar intuitive argument can be em-
ployed to show that a substitute of a substi-
tute is an indirect complement and that a
complement of a complement is an indirect 
complement. 


II. Definition and Properties of the Indirect
and Direct Substitution Effects 


In order to give the intuitive argument in
the last section a precise content, I propose
a definition of the direct and indirect
substitution terms. Suppose there is a
compensated price change in coffee, and consider
the change in demand for cream. In order
to remove the effect of a third good, say tea,
consider the change in demand for cream
when the consumption of tea is kept
constant. This change is the direct substitution
effect between coffee and cream with
respect to tea. The difference between Hicks’s
substitution effect and the direct
substitution effect may be called the indirect
substitution effect.¹ The indirect substitution
effect characterizes the effect of the specified
third good on the substitution effect. The
usefulness of the decomposition depends on
the two properties given below. Since the
direct substitution effect is nothing but
the substitution effect under “straight” or
“specific commodity” rationing, properties
of both the direct and the indirect
substitution effects are closely related to results
in the literature on rationing.


¹The direct and indirect substitution effects are
defined with respect to a specified third good. How- 
ever, I omit the phrase “with respect to...” when the 
third good is clear by context. 
A. General Preferences


Consider a consumer facing n goods.
When the consumer is unconstrained, the
expenditure function takes the form

(1) $m(p_a, p_b, u) = \inf_{x_a, x_b} \{ p_a' x_a + p_b' x_b ;\ v(x_a, x_b) \ge u \}$

where $x_a = (x_1, \ldots, x_k)$, $x_b = (x_{k+1}, \ldots, x_n)$,
$p_a = (p_1, \ldots, p_k)$, $p_b = (p_{k+1}, \ldots, p_n)$, and
$v(x_a, x_b)$ is a utility function. When the
consumer is constrained to consume $\bar{x}_b$ of
$x_b$, an expenditure function may be defined
as

(2) $\bar{m}(\bar{x}_b, p_a, u) = \inf_{x_a} \{ p_a' x_a ;\ v(x_a, \bar{x}_b) \ge u \}$.

It is assumed that $m(p_a, p_b, u)$ is twice
continuously differentiable with respect to
$(p_a, p_b)$ in a nonempty set T of $R^{n+1}$, so
that the infimum for $m(p_a, p_b, u)$ is uniquely
attained by $\partial m/\partial p_i = x_i^c(p_a, p_b, u)$ for
i = 1, ..., n. Here $x_i^c(p_a, p_b, u)$ is the compensated
(or Hicksian) demand function for the ith
good. It is also assumed that $\bar{m}(\bar{x}_b, p_a, u)$
evaluated at

$\bar{x}_b = x_b^c(p_a, p_b, u) = [x_{k+1}^c(p_a, p_b, u), \ldots, x_n^c(p_a, p_b, u)]$

is twice continuously differentiable with
respect to $(\bar{x}_b, p_a)$ in T. This assumption
implies, among other things, that the
infimum for $\bar{m}(x_b^c(p_a, p_b, u), p_a, u)$ is uniquely
attained by $\partial \bar{m}/\partial p_i = \bar{x}_i^c(\bar{x}_b, p_a, u)$ for
i = 1, 2, ..., k in T. Note that the compensated
constrained demand function for the ith
good, $\bar{x}_i^c(\bar{x}_b, p_a, u)$, does not depend on $p_b$.
Let

(3) $S_{ij}(p_a, p_b, u) = \partial x_i^c/\partial p_j\,(p_a, p_b, u)$
(i, j = 1, ..., n)




be the substitution term.² Define the direct
substitution term with respect to the
(k+1)th, ..., (n-1)th, and nth goods by

(4) $S^d_{ij}(p_a, p_b, u) = \partial \bar{x}_i^c/\partial p_j\,(x_b^c(p_a, p_b, u), p_a, u)$
(i, j = 1, ..., k).

It should be noted that the partial derivative
is evaluated at $\bar{x}_b = x_b^c(p_a, p_b, u)$. Otherwise,
it would not be possible to interpret
$S^d_{ij}$ and $S^i_{ij}$ (defined below) as a decomposition
of $S_{ij}$. The indirect substitution term
with respect to the (k+1)th, ..., (n-1)th,
and nth goods is defined by

(5) $S^i_{ij}(p_a, p_b, u) = S_{ij}(p_a, p_b, u) - S^d_{ij}(p_a, p_b, u)$
(i, j = 1, ..., k).

Let $S^d = [S^d_{ij}]_{i,j=1,\ldots,k}$ be the matrix of direct
and $S^i = [S^i_{ij}]_{i,j=1,\ldots,k}$ be the matrix of
indirect substitution terms.

It is easily seen that $S^d$ is symmetric
and negative semidefinite because $S^d_{ij} =
\partial^2 \bar{m}/\partial p_i \partial p_j$. The fact that $S^i$ is negative
semidefinite follows from the general prop-
erties of restricted expenditure functions
and is a special case of a Le Chatelier-type
result proved by J. P. Neary and K. W. S.
Roberts (1980 p. 35).


Property 1: For any $(p_a, p_b, u) \in T$, the di-
rect and indirect substitution matrices, $S^d$
and $S^i$, are symmetric and negative
semidefinite.


Because of the symmetry property, the fol-
lowing terminology is legitimate. If the di-
rect (indirect) substitution term between two
goods is positive, the goods are called direct
(indirect) substitutes; if the term is negative,


²For convenience, $S_{ij}$ is defined as a function of
$(p_a, p_b, u)$. On substituting the indirect utility function
for u, $S_{ij}$ can be regarded as a function of the prices
and income.
they are called direct (indirect) comple-
ments.

Let $S_{ib} = [S_{iq}]_{q=k+1,\ldots,n}$ be an (n-k)×1
vector and let $S_{bb} = [S_{ij}]_{i,j=k+1,\ldots,n}$ be an
(n-k)×(n-k) matrix of substitution
terms.


Property 2: For any $(p_a, p_b, u) \in T$, $S^i_{ij} =
S_{ib}' (S_{bb})^{-1} S_{jb}$ for $1 \le i \le k$ and $1 \le j \le k$ if
$S_{bb}$ is nonsingular.


PROOF:
Differentiating the identity

(6) $x_i^c(p_a, p_b, u) = \bar{x}_i^c(x_b^c(p_a, p_b, u), p_a, u)$
(i = 1, ..., k)

with respect to $p_j$ (j = 1, ..., k) yields

(7) $S_{ij} = S^d_{ij} + \sum_{q=k+1}^{n} (\partial \bar{x}_i^c/\partial \bar{x}_q)\, S_{qj}$
(j = 1, ..., k).

Differentiating (6) with respect to $p_j$ (j =
k+1, ..., n) yields

(8) $S_{ij} = \sum_{q=k+1}^{n} (\partial \bar{x}_i^c/\partial \bar{x}_q)\, S_{qj}$
(j = k+1, ..., n).

Combining (5), (7), and (8) completes the
proof.


In the special case where there is only one 
third good (k +1 = n), equation (8) is noth- 
ing but James Tobin and H. S. Houthakker’s 
(1950-51) equation 3.8, and equation (7) is 
their equation 5.2. 

Property 2 shows that the indirect substi- 
tution term between coffee and cream with 
respect to tea equals the substitution term 


An anonymous referee suggested this proof, which 
is much simpler than the proof given in the earlier 
versions of the present note. The proof is an applica- 
tion of Robert A. Pollak’s (1969) theory of conditional 
demand functions. 
between coffee and tea multiplied by the
substitution term between tea and cream 
divided by the own substitution term of tea. 
This property may be called the sign prop- 
erty of the indirect substitution effect: since 
the own substitution term of tea is always 
negative, the sign of the indirect substitu-
tion term between coffee and cream is deter-
mined by those of the substitution term
between coffee and tea and the substitution 
term between tea and cream. It follows from 
the sign property that a complement of a 
substitute is an indirect substitute, that a 
substitute of a substitute is an indirect com- 
plement, and that a complement of a com- 
plement is an indirect complement. Thus, 
the definition formalizes the intuitive argu- 
ment given in the last section. Property 2 
also makes it possible to calculate the direct 
and the indirect substitution effects from 
only the knowledge of substitution terms. 
Thus, whenever an empirical researcher ob- 
tains a strange result about the substitute- 
complement relationship of two goods, he 
or she can check whether some third good 
renders the two goods substitutes or com- 
plements through the indirect substitution 
effect. 
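
To make the sign property concrete, here is a small Python sketch for the case of a single third good. The substitution matrix is hypothetical: it was constructed to be symmetric with rows summing to zero (all prices normalized to one), and it includes a fourth composite good, "other", so that the matrix can be negative semidefinite. None of the numbers come from the text, and the function name is invented for illustration.

```python
goods = ["coffee", "cream", "tea", "other"]

# Hypothetical Hicksian substitution matrix S[i][j] (illustrative values only).
S = [
    [-2.4,  0.4,  2.0,  0.0],   # coffee
    [ 0.4, -2.4, -2.0,  4.0],   # cream
    [ 2.0, -2.0, -4.0,  4.0],   # tea
    [ 0.0,  4.0,  4.0, -8.0],   # other
]

def decompose(S, i, j, b):
    """Split S[i][j] into (direct, indirect) terms with respect to good b.

    With one third good, Property 2 reduces to
    indirect = S[i][b] * S[b][j] / S[b][b].
    """
    indirect = S[i][b] * S[b][j] / S[b][b]
    direct = S[i][j] - indirect
    return direct, indirect

coffee, cream, tea = 0, 1, 2
direct, indirect = decompose(S, coffee, cream, tea)
print("Hicks term:", S[coffee][cream])
print("direct:", direct, "indirect:", indirect)
```

In this made-up example coffee and cream are substitutes in Hicks's sense (positive term), direct complements (negative direct term), and indirect substitutes with respect to tea, because cream is a complement of tea and tea is a substitute for coffee.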

Note that Property 2 also holds in the
elasticity form. Let $e_{ij} = S_{ij} p_j / x_i^c$ be the
compensated cross-price elasticity of the jth
price on the ith good. Define $e^d_{ij} =
S^d_{ij} p_j / x_i^c$ and $e^i_{ij} = S^i_{ij} p_j / x_i^c$ as direct and indi-
rect cross-price elasticities, respectively.
Then,

(9) $e_{ij} = e^d_{ij} + e^i_{ij}$

and by (7), (8), and (9),

(10) $e^i_{ij} = e_{ib}' (e_{bb})^{-1} e_{bj}$

where

$e_{ib} = [e_{iq}]_{q=k+1,\ldots,n}$ and $e_{bj} = [e_{qj}]_{q=k+1,\ldots,n}$

are (n-k)×1 vectors and $e_{bb} =
[e_{ij}]_{i,j=k+1,\ldots,n}$ is an (n-k)×(n-k) ma-
trix.


B. Separable Preferences 


Sono (1961) defined the concept of the 
proper substitution effect in the case of 
weak separability in preferences. It is of 
interest to see the relation between the 
direct substitution effect and the proper 
substitution effect under separability. As-
sume that $x_a$ is weakly separable from the
goods in $x_b$, and let $\tilde{u}(x_a) = v(x_a, \bar{x}_b)$. Then
$v(x_a, x_b) = f(\tilde{u}(x_a), x_b)$ for some function
f. Suppose that $x_a^p(p_a, u) = [x_1^p(p_a, u), \ldots,
x_k^p(p_a, u)]$ solves the problem of minimizing
$p_a' x_a$ subject to $\tilde{u}(x_a) = u$. Then, Sono’s
proper substitution term is $\partial x_i^p/\partial p_j$ for i, j
= 1, ..., k. Obviously, $x_i^p(p_a, u) = \bar{x}_i^c(\bar{x}_b, p_a, u)$
must hold. Hence, Sono’s proper substitu-
tion term is equal to the direct substitution
term with respect to $x_b$ under separability.
Sono showed that the proper substitution
term could have a different sign from $S_{ij}$,
which Sono called the general substitution 
term. Hence, the indirect substitution effect 
can dominate the direct substitution term 
even under separability. 


III. An Empirical Illustration


In order to illustrate the use of our re- 
sults, they are applied to estimates of cross- 
price elasticities given in Angus S. Deaton 
(1974). Deaton estimated $c_{ij} = w_i e_{ij}$, using
maximum-likelihood methods for British
data from 1900 to 1970, where $w_i$ is the
budget share of the ith good. We use the
budget share $w_i$ given in Deaton (1974
p. 360) to convert estimates of $c_{ij}$ reported
in his table III for the symmetric version of
the Rotterdam model into elasticities, $e_{ij}$.
Once the $e_{ij}$’s are obtained, indirect and direct
cross-elasticities with respect to any third 
good can be calculated from relations (9) 
and (10). 

Deaton’s point estimate of the cross-price 
elasticity of food and entertainment (con- 
TABLE 1—INDIRECT AND DIRECT CROSS-PRICE ELASTICITIES FOR FOOD AND
ENTERTAINMENT WITH RESPECT TO ALTERNATIVE THIRD GOODS

Item                          Indirect    SEᵃ      Direct     SEᵃ

Footwear and clothing          -0.063    0.053      0.014    0.079
Housing                        -0.005    0.032     -0.044    0.021
Fuel                           -0.001    0.044     -0.048    0.018
Drink and tobacco              -0.003    0.033     -0.046    0.018
Travel and communication       -0.002    0.052     -0.047    0.023
Other goods                     0.001    0.047     -0.050    0.018
Other services                 -0.001    0.065     -0.049    0.019

Note: Estimates reported in table III of Deaton (1974) were used to calculate indirect
and direct cross-price elasticities for consumption of food when the price of entertain-
ment changes.

ᵃThe standard errors reported were calculated with the assumption that the
correlations between Deaton’s estimates were zero.


sisting of books and magazines, newspapers, 
and other entertainment) indicates that 
these two goods are complements. This 
might be counterintuitive, because there
does not seem to exist a strong reason to
believe that consumption of food and books 
should be increased simultaneously. In 
Table 1, indirect and direct cross-price elas- 
ticities for food and entertainment are re- 
ported for alternative choices of the third 
good. A mean-value approximation (the
delta method) can be used to calculate stan-
dard errors of the estimates reported in
Table 1. This calculation requires the
covariance matrix of estimates of cross-price
elasticities. Since Deaton did not report the
covariance between different estimates, I
assumed that the covariance was zero in the
calculation. Hence, the standard errors re-
ported in Table 1 are approximations.
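
For a single third good b, the first-order (delta-method) variance implied by relation (10) can be written out as follows. This is the generic textbook formula under the stated zero-covariance assumption, with the budget shares treated as known constants; the exact computation behind Table 1 is not spelled out in the text, so the expressions below are a sketch rather than the author's own derivation.

```latex
% Indirect elasticity with one third good b (relation (10) with n - k = 1):
%   e^i_{ij} = e_{ib} e_{bj} / e_{bb}
% First-order variance, assuming zero covariances between the estimates:
\[
\operatorname{Var}\!\left(\hat e^{\,i}_{ij}\right) \approx
  \left(\frac{e_{bj}}{e_{bb}}\right)^{2} \operatorname{Var}(\hat e_{ib})
+ \left(\frac{e_{ib}}{e_{bb}}\right)^{2} \operatorname{Var}(\hat e_{bj})
+ \left(\frac{e_{ib}\,e_{bj}}{e_{bb}^{2}}\right)^{2} \operatorname{Var}(\hat e_{bb})
\]
% Direct elasticity from relation (9), e^d_{ij} = e_{ij} - e^i_{ij}:
\[
\operatorname{Var}\!\left(\hat e^{\,d}_{ij}\right) \approx
  \operatorname{Var}(\hat e_{ij}) + \operatorname{Var}\!\left(\hat e^{\,i}_{ij}\right)
\]
```

Each coefficient is the squared partial derivative of the indirect elasticity with respect to the corresponding estimate, evaluated at the point estimates.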

The hypothesis that food and entertain-
ment are direct substitutes can be rejected
at the 5-percent level for all potential third
goods examined, except for footwear and
clothing (clothing for short). The point esti-
mate of the direct-substitution term sug-
gests that food and entertainment are direct
substitutes with respect to clothing. Thus, 
the hypothesis that the classification is due 
to the indirect effect from the third good, 
clothing, cannot be rejected. This is because 
clothing is estimated to be a strong substi- 
tute both for food and for entertainment. 
Auction Institutional Design: Theory and Behavior of
Simultaneous Multiple-Unit Generalizations
of the Dutch and English Auctions 


By Kevin A. MCCABE, STEPHEN J. RASSENTI, AND VERNON L. SMITH*


Historically, English and Dutch auctions 
have been used for the exchange of single 
objects such as works of art or single lots of 
a good such as produce, fish, or cut flowers. 
Where these institutions have been used for 
the exchange of multiple units, such as the 
Australian wool auction (using English 
rules), successive lots of the good are some- 
times sold sequentially at auction. In some, 
but not all, instances this is because the 
goods are not identical, even though the 
various lots may be close substitutes (see 
Penny Burns, 1985). Where the goods are 
accepted universally as being homogeneous, 
as in the securities markets, multiple units 
are often commonly auctioned simultane- 
ously. In the securities industry, orders are 
batched for simultaneous execution in mul- 
tiple-unit auctions in what are referred to as 
“call markets”; that is, the security is 
“called” for auction at a particular point in 
time. This type of market is used on the 
stock exchanges of Austria, Belgium, 
France, Germany, and Israel. Some of these 
are verbal, and some are sealed bid auc- 
tions. 

Although the U.S. organized exchanges 
are predominantly continuous rather than 
call markets (except that call markets are 
used each day to open trading in each listed 
security), there is a growing number of ex- 
ceptions such as the proliferation, since 
1984, of Auction Preferred Stock (Gold- 
man, Sachs and Co., October 1984) and 
Money Market Preferred Stock (Lehman 
Brothers, July 1984). We now have Dutch 
Auction Rate Transferable Securities, called 
DARTS, Stated Rate Auction Preferred 


*Economic Science Laboratory, University of Ari- 
zona, Tucson, Arizona. This material is based upon 
work supported by the National Science Foundation 
under grant no. SES-8320121. 
Stock, or STRAPS, and many more. After
the initial subscription offering of this type 
of security, the market is called every 49 
days to reset the preferred dividend rate 
using a multiple-unit auction. The exchange 
of shares and the dividend determination is 
based on the array of stated dividend rates 
at which existing holders and potential new 
holders are willing to sell and/or buy corre- 
sponding quantities. The dividend rate and 
exchange of shares every 49 days is executed 
using the uniform price or competitive 
sealed bid mechanism (Vernon L. Smith 
et al., 1980). The discussion to follow will be 
confined to this sealed bid form of the call 
market.

Call markets provide temporal consolida- 
tion of trade orders or other forms of ex- 
pressing the desire to buy and sell. By 
comparison with continuous trading, call 
markets offer both advantages and disad- 
vantages (Robert A. Schwartz, 1988 pp. 
442-6). The cited advantages include low 
cost of operating the exchange; information 
aggregation and presumed pricing effi- 
ciency; price stability; individual trades, 
which are thought to have a small impact on 
price; reduced price uncertainty; and, fi- 
nally, nondiscriminatory pricing. However, 
there are offsetting disadvantages: (1) the 
market is inaccessible except at the time of 
call; (2) no bid, offer, contract, or price 
information is available until the results of 
the call are announced; and (3) there is 
transaction uncertainty because a submitted 
bid (offer) may be too low (high) to execute 
inside the supply-demand cross. These con- 
ditions are only partially alleviated if there 
is a secondary market between calls. 

These disadvantages may be significant. 
In September 1988, the Wall Street Journal 
published an article on the failure of a call 
market for the auction rate preferred stock 
of Kroger Co. The article lists several other
call market failures in addition to the Kroger 
issue. 

We are currently researching the theoret- 
ical and behavioral properties of a number 
of new proposed institutions that represent 
alternatives to the sealed bid “call” auction. 
These are exercises in institutional design.
Previous experimental research has estab-
lished that the two-sided uniform price
sealed bid-offer auction is less efficient and
yields less competitive prices than does the
continuous double auction. In fact, no insti-
tution studied to date is more efficient than
the latter. Our objective is to explore a
variety of new institutional designs in search
of an institution at least as efficient as the
continuous double auction, that has all of
the advantages claimed for call markets,
and that corrects for disadvantages (2) and
(3) cited above. That is, we seek new trading
institutions suitable for call market applica-
tion in which traders receive some form of
information feedback enabling them to
make desired adjustments and to reduce
uncertainty as to whether they will be able
to transact. 


I. Institutional Design: Two New Mechanisms 
for Call Market Exchange 


In this paper, we compare multiple-unit
generalizations of the Dutch and English
auction institutions when there is an inelas-
tic supply of four homogeneous units of a
good and ten prospective buyers. Since the
offer price is adjusted automatically by clock
in both of the institutions we study, we refer
to them as the Dutch clock and English
clock mechanisms. It is common informa-
tion that the buyers desire at most one unit
of the good and that the buyers’ reservation
values have been drawn randomly from a
uniform distribution over 1-cent discrete in-
crements from [0, 274].


¹Subjects for the experiment were recruited from
the undergraduate population at the University of Ari-
zona. They were paid three dollars at the beginning of
each experiment as an incentive to show up. At the end
of each experiment subjects were paid their salient 
earnings in U.S. dollars. 
This study is motivated by the seminal
work of William Vickrey (1961, 1962, 1976) 
and the objective of overcoming some of the 
disadvantages of the sealed bid form of call 
market organization. We also extend the 
experimental research on single- and multi- 
ple-unit auctions. For a single-unit environ- 
ment, Vicki M. Coppinger, Vernon L. Smith, 
and Jon A. Titus (1980) find that their ex- 
perimental data are consistent with some of 
the predictions made by William Vickrey 
(1961). Thus, mean prices are statistically 
the same for English and Dutch auctions; 
the variance in Dutch prices is less than the 
variance in English prices. James C. Cox, 
Bruce Roberson, and Vernon L. Smith 
(1982) generalize the single-unit Vickrey 
model to allow heterogeneous risk aversion. 
This generalization explains the lower effi- 
ciency of the Dutch auction when compared 
to the English auction. 

Our proposed multiple-unit Dutch and 
English clock mechanisms are shown to have 
the same theoretical properties as the cor- 
responding single-unit Dutch clock and En- 
glish oral auctions. We find that most of the 
qualitative differences between Dutch and 
English single-unit auctions extend to 
nondiscriminatory multiple-unit versions of 
these institutions. 

Each experiment consisted of 22 auctions, 
all of which used either the Dutch or English
institutional rules.² Each auction was
run as follows. First, each subject was as- 
signed a private resale value on a Plato 
terminal. After each subject had been as- 
signed a value at random, the auction was 
begun. The auction took place at the front 
of the room with one of the experimenters 
acting as auctioneer. When the auction was 
over, the common market price for the good 
and the actual winners were announced. 
This information was then recorded on each 
subject’s Plato terminal. In addition, each 


²Once subjects had all arrived, they were given a set
of written instructions for the auction institution used 
that day. Subjects were asked to read the instructions 
and then listen to a brief example presented by one of 
the experimenters. After the instructions were finished 
and any questions answered, the experiment was be- 
gun. 




subject’s profit was displayed privately on 
the terminal. After each subject had exam- 
ined the auction outcome and his private 
profit, the next auction period was begun. 

If a subject did not win a unit, he made 
zero profit; otherwise, a subject’s profit was 
the difference between his resale value and 
the market price. Subjects could make nega- 
tive profits in any period, but total profit in 
the experiment was constrained to be non- 
negative. This constraint was never binding. 
Subjects were paid cash equal to three times 
their total profit (rounded up to the nearest 
quarter) at the end of the experiment; to all 
subjects this was common information. 
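The payoff rule described above can be sketched in a few lines. This is a minimal illustration, not the experimenters' software: it assumes values and prices are in cents, that the multiplication by three is applied before rounding up to the next quarter, and the function names are hypothetical.

```python
def period_profit(resale_value, price, won):
    """Profit (in cents) for one auction period.

    A subject who wins a unit earns resale value minus market price;
    a subject who does not win earns zero.
    """
    return resale_value - price if won else 0


def cash_payment(period_profits, multiplier=3, quarter=25):
    """Cash paid (in cents) at the end of the experiment.

    Assumptions (not stated in this form in the text): total profit is
    floored at zero, multiplied by three, and then rounded up to the
    nearest multiple of 25 cents.
    """
    total = max(sum(period_profits), 0)
    scaled = multiplier * total
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-scaled // quarter) * quarter


# Example: profits of 10, 0, -3 and 20 cents sum to 27 cents;
# three times that is 81 cents, which rounds up to 100 cents.
print(cash_payment([10, 0, -3, 20]))
```

A single period can produce a negative profit (for example, a buyer who stays in above his value), which is why the floor is applied to the total rather than to each period.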


II. English Clock: Institution and Theory


The English clock is set initially at zero, 
and all buyers who wish to buy at this price 
are asked to raise their identification cards. 
The clock then begins to increment by 5 
cents (until five subjects are active and then 
by 1 cent). As long as buyers wish to stay in
the auction, they keep their cards raised. 
Once a buyer lowers his card, he is out of 
the auction. The clock continues to rise 
until only four buyers remain. As soon as 
the fifth buyer drops out of the auction, the 
clock is stopped and the market price paid 
by the remaining four winners is the price 
on the clock. (See Ralph Cassiday [1967] for a
discussion of an experimental English clock
for single-object exchange.)

The theory of the single-object English 
auction generalizes naturally to the multi- 
ple-unit English clock. Each buyer has a 
dominant strategy to stay in the auction as 
long as the clock price is below his or her 
resale value.³ When the clock rises to a
buyer’s resale value, he or she is motivated 
to drop out of the auction. The predicted 
equilibrium price is the fifth highest resale 
value. The auction is Pareto optimal since 


³This theory is based on the standard assumption
that values are drawn from a density function and 
therefore ties have zero probability. In fact, of course, 
draws are from a discrete mass function, but the proba- 
bility of a tie is only 1 in 50,625. 
the players with the four highest resale val-
ues are predicted to each win a unit of the 
good. 
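The predicted English clock outcome can be illustrated with a short sketch. It assumes every buyer follows the dominant strategy of dropping out exactly when the clock reaches his resale value and that there are no ties; the function name and the sample values are hypothetical.

```python
def english_clock(values, units=4):
    """Predicted outcome of the multiple-unit English clock.

    values: resale values, one per buyer (assumed distinct).
    Returns (price, winners): the price is the (units + 1)-th highest
    value, the point at which the last excess buyer drops out; winners
    are the indices of the buyers with the `units` highest values.
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    winners = sorted(order[:units])
    price = values[order[units]]
    return price, winners


# Ten hypothetical resale values in cents.
values = [12, 200, 87, 150, 33, 190, 64, 101, 175, 5]
price, winners = english_clock(values)
print(price, winners)  # 101 [1, 3, 5, 8]
```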


III. Dutch Clock: Institution and Theory


In our experiments, the Dutch clock is set 
at 225, and all buyers who wish to buy at 
this price are asked to raise their ID cards 
and keep them up. The clock then begins to 
decrement by 5 cents until three subjects 
have their cards raised, then by 1 cent. The 
clock continues to fall until four buyers have 
their cards raised. At this point, the clock is 
stopped and the market price paid by the 
four winners is the price on the clock. 

Consider a theoretical version of this in-
stitution in which an analogue clock drops
continuously. Let $v_i \in [0,1]$ be the value to
player i of buying a unit. We assume that
the $v_i$ are chosen independently from a contin-
uous uniform distribution; the value $v_i$ is
private information; there are N buyers in
the auction; there are M < N units to be
sold; and the buyers are risk-neutral.

Denote by $b_{ij}$ player i’s reservation price
for raising his card when j units (M minus the
number of raised cards) remain to be sold.
Vickrey has shown that


$$b_{i1} = \frac{N-1}{N}\, v_i \tag{1}$$


is a Nash equilibrium bid function when
M = 1.

Let $\alpha_M, \ldots, \alpha_1$ be an arbitrary set of frac-
tions that satisfy $1 > \alpha_M \ge \alpha_{M-1} \ge \cdots \ge \alpha_1 = (N-M)/(N-M+1)$.
Then the following is a Nash equilibrium bidding strategy
for M > 1.


$$b_{ij} = \alpha_j v_i. \tag{2}$$


Equation (2) defines a class of Nash equilib-
rium bidding strategies since $\alpha_M, \ldots, \alpha_2$ can
be any order-preserving sequence of pro-
portions as long as $\alpha_j \ge (N-M)/(N-M+1) = \alpha_1$.

For convenience, let $v_1 > v_2 > \cdots > v_N$.
By assumption, the probability of a tied
value is zero. Note that, by following (2),
players 1,..., M − 1 will each receive a unit.
Player M will determine the price at


$$\alpha_1 v_M = \frac{N-M}{N-M+1}\, v_M \tag{3}$$


and also receives a unit. The remaining
players M + 1,..., N will not receive a unit.
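To make the allocation and price under rule (2) concrete, the following sketch simulates the falling clock when every buyer uses the same ratios $\alpha_j$. The ratios chosen for $j > 1$ and the resale values are hypothetical; only $\alpha_1 = (N-M)/(N-M+1)$ is pinned down by the theory.

```python
from fractions import Fraction


def dutch_clock(values, alphas):
    """Simulate the Dutch clock when all buyers follow rule (2).

    values: resale values, one per buyer (assumed distinct).
    alphas: dict mapping j (units remaining) to the ratio alpha_j,
            for j = 1..M, nonincreasing as j falls.
    Returns (price, winners): the clock reading when the last card is
    raised, and the sorted indices of the buyers who raised cards.
    """
    units = len(alphas)
    waiting = set(range(len(values)))
    winners = []
    clock = None
    for j in range(units, 0, -1):
        # All waiting buyers use the same ratio alpha_j, so the one
        # with the highest value has the highest reservation price.
        nxt = max(waiting, key=lambda i: values[i])
        bid = alphas[j] * values[nxt]
        # The clock only falls, so a card is raised at the lower of the
        # current clock reading and the buyer's reservation price.
        clock = bid if clock is None else min(clock, bid)
        waiting.remove(nxt)
        winners.append(nxt)
    return clock, sorted(winners)


N, M = 10, 4
alphas = {4: Fraction(9, 10), 3: Fraction(9, 10), 2: Fraction(7, 8),
          1: Fraction(N - M, N - M + 1)}
values = [12, 200, 87, 150, 33, 190, 64, 101, 175, 5]
price, winners = dutch_clock(values, alphas)
print(price, winners)  # 900/7 [1, 3, 5, 8]
```

The price depends only on $\alpha_1$ and the Mth highest value, which is why any admissible choice of the other ratios yields the same outcome.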

Given that all other players bid according
to (2), can any individual do better by bid-
ding other than (2)? Assume k units remain
to be sold. First, we consider the possibility
of ever bidding less than (2). If k = 1, we
have arrived at the one-unit endgame, and
Vickrey has shown that bidding $\alpha_1 v$ is opti-
mal. When k > 1, no player who would not
otherwise win before the endgame can help
himself by bidding less than $\alpha_k v$. For any
player who would otherwise win before the
endgame, bidding less than $\alpha_k v$ cannot im-
prove his price unless he consistently bids
low enough to force himself into the
endgame. But that never pays off since he
would replace a lower-valued player as the
price setter and set a higher price $\alpha_1 v$ for
himself.

Now let us consider the possibility of bid-
ding more than suggested by (2). An indi-
vidual with value v is trying to maximize his
expected gain $E_k$ from the remainder of the
auction. $M - k$ players already have their
cards raised, and $N - M + k - 1$ others re-
main competitive in the auction. Let $\alpha_{k+1} w_k$
be the bid tendered by the most recent
($(M-k)$th) player to raise his card. Then the
values of the remaining competitors are dis-
tributed uniformly on $[0, w_k]$. Now let us
describe his expected return as a function of
his reservation price $p_k$ for raising his card
when k units remain to be sold:


$$
\begin{aligned}
E_k = {} & \left(\frac{p_k}{\alpha_k w_k}\right)^{(N-M+k-1)}
\left[\, v - \alpha_1 \frac{(N-M+1)}{(N-M+k)}\, \frac{p_k}{\alpha_k} \right] \\
& + \int_{p_k/\alpha_k}^{w_k} E_{k-1}^{*}\,
\frac{(N-M+k-1)\, w_{k-1}^{(N-M+k-2)}}{w_k^{(N-M+k-1)}}\; dw_{k-1}.
\end{aligned}
\tag{4}
$$

The first term of (4) represents expected
gain from being the next player to raise his
card, while the second term represents his
expected gain when another card is raised
first, forcing him to continue playing in the
$k - 1$ unit subgame. The elements of (4) can
be described as follows. The expression
$(p_k/\alpha_k w_k)^{(N-M+k-1)}$ is the probability that
all remaining $N - M + k - 1$ players bid less
than $p_k$ given that their values are uniform
on $[0, w_k]$ and they use the bidding ratio $\alpha_k$.
The expression $\alpha_1[(N-M+1)/(N-M+k)][p_k/\alpha_k]$
is the expected bid of the price
setter who has the $(k-1)$th highest value of
the remaining $N - M + k - 1$ players whose
values are uniform on $[0, p_k/\alpha_k]$. The vari-
able $E_{k-1}^{*}$ is the maximized expected return
for the $k - 1$ unit subgame he must play
when someone with a value $w_{k-1} > p_k/\alpha_k$
raises his card first. The expression
$(N-M+k-1)\, w_{k-1}^{(N-M+k-2)} / w_k^{(N-M+k-1)}$ is the
density function of the first-order statistic of
the remaining $N - M + k - 1$ players whose
values are uniform on $[0, w_k]$.

For notational convenience, we now de-
fine a recursive function $R_k(w_k, v)$ as fol-
lows:


$$R_1 = 1; \qquad R_k = k + \frac{(N-M+k)}{v} \int_{v}^{w_k} R_{k-1}(w_{k-1}, v)\, dw_{k-1}, \quad \text{for } k > 1. \tag{5}$$



By solving the first-order condition of (4),
$dE_k/dp_k = 0$, for the cases k = 2 and 3, we
can show $p_k^{*} = \alpha_k v$, and the following form
for $E_k^{*}$ results:


$$E_k^{*} = \frac{v^{(N-M+k)}}{(N-M+k)\, w_k^{(N-M+k-1)}}\, R_k. \tag{6}$$


Taking the first-order condition for the
general case of (4), with (5) and (6) appro-
priately substituted, we get the following
equation:

$$
\begin{aligned}
0 = {} & \frac{(N-M+k-1)}{\alpha_k w_k^{(N-M+k-1)}} \left(\frac{p_k}{\alpha_k}\right)^{(N-M+k-2)} v
 - \frac{(N-M)}{\alpha_k w_k^{(N-M+k-1)}} \left(\frac{p_k}{\alpha_k}\right)^{(N-M+k-1)} \\
& - \frac{(k-1)\, v^{(N-M+k-1)}}{\alpha_k w_k^{(N-M+k-1)}}
 - \frac{(N-M+k-1)\, v^{(N-M+k-2)}}{\alpha_k w_k^{(N-M+k-1)}}
 \int_{v}^{p_k/\alpha_k} R_{k-2}\, dw_{k-2}.
\end{aligned}
\tag{7}
$$


An obvious root of (7) is $p_k = \alpha_k v$ since the
integral term disappears, there is a common
denominator, and the numerator be-
comes $((N-M+k-1) - (N-M) - (k-1))\, v^{(N-M+k-1)} = 0$. Thus, by induction,
no player can do any better at any stage of
the game by bidding more than $\alpha_k v$. Hence
(2) is a class of Nash equilibria.
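As a consistency check on the risk-neutral theory (a sketch that is not part of the original argument), one can compare expected prices in the two institutions. It assumes values are independent uniform draws on [0,1] and uses the standard result that the kth highest of n such draws has mean $(n+1-k)/(n+1)$; the function names are hypothetical.

```python
from fractions import Fraction


def expected_kth_highest(k, n):
    """Mean of the k-th highest of n independent uniform[0,1] draws."""
    return Fraction(n + 1 - k, n + 1)


def expected_english_price(n, m):
    """English clock: price equals the (m + 1)-th highest value."""
    return expected_kth_highest(m + 1, n)


def expected_dutch_price(n, m):
    """Dutch clock under rule (2): price equals alpha_1 times the
    m-th highest value, with alpha_1 = (n - m) / (n - m + 1)."""
    alpha_1 = Fraction(n - m, n - m + 1)
    return alpha_1 * expected_kth_highest(m, n)


# The experimental design uses ten buyers and four units.
print(expected_english_price(10, 4), expected_dutch_price(10, 4))  # 6/11 6/11
```

Under these assumptions both institutions have the same expected price, $(N-M)/(N+1)$, so a systematic gap in observed mean prices points to departures from risk neutrality rather than to the mechanisms themselves.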

If players are risk averse and can be char-
acterized by the Arrow-Pratt measure of
constant relative risk aversion $(1 - r_i)$ for
bidder i, with $r_i$ in (0,1], then James C. Cox,
Bruce Roberson, and Vernon L. Smith
(1982) have shown that the Nash bidding
rule for our endgame is $6/(6 + r_i)$ of the
resale value.⁴ If the $r_i$ is distinct for each i
among the seven remaining bidders, then a
buyer other than the one with fourth high-
est value could win the fourth unit, thus
changing subjects’ optimal strategies.


⁴This theory is based on the standard assumption
that values are drawn from a density function and
therefore ties have zero probability. In fact, of course,
draws are from a discrete mass function, but the proba-
bility of a tie is only 1 in 50,625.
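A small numerical sketch shows how heterogeneous risk attitudes can change who wins the last unit. It assumes the endgame rule quoted above, with six rivals remaining, and uses hypothetical values and parameters; the function name is illustrative.

```python
def crra_endgame_bid(value, r, rivals=6):
    """Endgame reservation price under constant relative risk aversion.

    r = 1 is the risk-neutral case (ratio 6/7 with six rivals);
    smaller r means more risk aversion and a higher ratio.
    """
    return rivals / (rivals + r) * value


# Hypothetical bidders: A has the higher value but is risk neutral,
# B has a lower value but is more risk averse.
bid_a = crra_endgame_bid(150, r=1.0)
bid_b = crra_endgame_bid(145, r=0.5)

# B's reservation price is higher, so B raises his card first as the
# clock falls and takes the unit even though A values it more.
print(bid_b > bid_a)  # True
```

This is the sense in which differing $r_i$ can produce allocations that are not Pareto optimal in the Dutch clock.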
IV. Data


Figure 1 graphs actual market price mi-
nus the predicted price for each of the four
clock experiments. In Table 1 we provide
comparisons of mean prices and price vari-
ances. A paired t-test comparing actual En-
glish prices with predicted English prices
results in a t value equal to −2.121, which
is not significant at the 1-percent level. In
22 of the 44 auctions the predicted price
exactly equaled the actual price, while in an
additional 14 auctions actual price was
within 1 cent of the predicted price. In all
but one case the difference is nonpositive.
(In that case the buyer with the fifth highest
resale value left the market slightly later
than predicted.) These results differ from
the single-unit oral English auction, where 
actual prices tend to be slightly above pre- 
dicted prices. This is explained by the fact 
that Vicki M. Coppinger, Vernon L. Smith, 
and Jon A. Titus (1980) conducted an En- 
glish oral auction (not an English clock auc- 
tion) in which bids were advanced from the 
floor by the bidders. This causes some over- 
bidding, depending upon the size of the 
increment by which the highest-value bidder 
advances the penultimate bid. For a study 
of the serious strategic problems in English 
multiple-unit auctions that are created when 
bidders advance the price from the floor, see 
Kevin A. McCabe, Stephen J. Rassenti, and 
Vernon L. Smith (1988). 

Figure 2 plots the cumulative distribu- 
tions of actual and theoretical Dutch prices. 
A paired t-test in Table 1 (with a t value equal to
3.963) confirms that actual prices are signif-
icantly greater (at the 1-percent level) than
predicted prices. This is consistent with any 
risk-aversion model of subject behavior. 
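The paired t statistic used in these comparisons can be computed as follows. This is a generic sketch of the standard formula, not the authors' code, and the price series in the example are invented for illustration.

```python
import math


def paired_t(actual, predicted):
    """Paired t statistic for the mean of (actual - predicted).

    Uses the sample standard deviation of the differences with
    n - 1 degrees of freedom (43 for a series of 44 auctions).
    """
    diffs = [a - p for a, p in zip(actual, predicted)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


# Hypothetical prices in cents for five auctions.
actual = [120, 131, 98, 140, 111]
predicted = [115, 128, 99, 133, 108]
t_value = paired_t(actual, predicted)
print(round(t_value, 3))
```

A positive statistic indicates that actual prices exceed predicted prices on average, as reported for the Dutch clock; a negative one indicates the reverse, as for the English clock.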

As can be observed in Figure 1, the Dutch 
clock tends to create much greater variabil- 
ity in price differences when compared to 
the English clock. However, as predicted by 
the theory, the variance in the raw observed 
Dutch prices (858) is less than the variance
in the raw observed English prices (926). F
tests in Table 1 show that the differences in
variances between actual and predicted
prices are insignificant for both clock auc-
tions.⁵


[Figure 1 plot. Series: Dutch Clock, English Clock. Horizontal axis: Auction Period 1-22 (Exp E1 and D1); 23-44 (Exp E2 and D2)]

FIGURE 1. GRAPH OF THE DIFFERENCE BETWEEN THE ACTUAL EXPERIMENT
PRICE AND THE THEORETICAL PREDICTION IN EACH AUCTION PERIOD.
FOR THE DUTCH CLOCK, WE USE THE RISK-NEUTRAL PREDICTION


TABLE 1—COMPARISON OF MEANS AND VARIANCES USING PAIRED t-TESTS
AND F TESTS (43 DEGREES OF FREEDOM)

Series Compared                             Means,      Variances,
                                            t Value     F Value

Dutch Actual and Dutch Theoretical           3.963       1.225
  (Conditional on Values)
English Actual and English Theoretical      −2.121       1.026
  (Conditional on Values)
Dutch Actual and English Actual              2.895       1.079


The mean Dutch price is 119 compared
to 111 for the mean English price. This


⁵We calculate an F value of 1.225 for actual versus
predicted price variances in the Dutch clock experi-
ments and an F value of 1.026 for actual versus
predicted price variances in the English clock experi-
ments. Neither F is significant at the 10-percent level.



difference is significant at the 1-percent level
with a paired t-test value in Table 1 of
2.895 and is larger than that generally ob-
served in the single-unit oral auctions, due
in part, perhaps, to our explanation of why
the English oral auction price is higher than
the English clock price. However, in the last
15 auctions the mean Dutch price was 124 
and the mean English price was 120, sug- 
gesting some convergence in this measure. 
FIGURE 2. CUMULATIVE DISTRIBUTION OF PRICES
FOR DUTCH AUCTION EXPERIMENTS, COMPARISON 
OF ACTUAL Vs. PREDICTED DuTCH CLOCK 
AUCTION FOR 44 AUCTION PERIODS 


Efficiency can be measured as actual 
buyer surplus (the sum of the winners’ re- 
sale values) divided by the predicted surplus 
(the sum of the four highest resale values). 
We find that the English clock is more ef-
ficient, with an efficiency of 1.0 in 43 of the
44 auctions.⁶ The one exception is an effi-
ciency of 0.9856 in period 5 of experiment 
E1. The Dutch clock achieves an efficiency 
of 1.0 in 28 of the 44 auctions, with an 
efficiency as low as 0.8661 in period 11 of 
experiment D1. By the constant-relative- 
risk-aversion model, this is interpreted as 
implying that subjects have differing param-
eter values, $r_i$.
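The efficiency measure defined above is simple to compute. The sketch below follows the definition in the text; the function name and the resale values are hypothetical.

```python
def efficiency(values, winners, units=4):
    """Realized surplus divided by maximum attainable surplus.

    values: resale values, one per buyer.
    winners: indices of the buyers who actually received a unit.
    The maximum is the sum of the `units` highest resale values.
    """
    realized = sum(values[i] for i in winners)
    maximum = sum(sorted(values, reverse=True)[:units])
    return realized / maximum


values = [12, 200, 87, 150, 33, 190, 64, 101, 175, 5]

# The four highest-value buyers win: efficiency is exactly 1.0.
print(efficiency(values, [1, 3, 5, 8]))

# The fifth-highest buyer (value 101) displaces the fourth (value 150).
print(round(efficiency(values, [1, 5, 7, 8]), 4))
```

Because the market price is a transfer between buyers and the seller, it does not enter the measure; only the identity of the winners matters.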

Finally, we can compare the revenue-gen- 
erating properties of the two clock auctions. 
Figure 3 plots the average revenue for each 




⁶A paired t-test between Dutch and English surplus
results in a t value equal to −3.193, which is signifi-
cant at the 1-percent level. Furthermore, a paired t-test
comparing the theoretical maximum surplus with the
English clock surplus is insignificant with a t value of
1.00.
FIGURE 3. AVERAGE REVENUE BY AUCTION
PERIOD FOR THE ENGLISH AND DUTCH CLOCKS 


clock (two experiments in each clock) for 
each of the 22 auctions. The Dutch clock 
generates more revenue than the English 
clock. Over all 44 auctions, the Dutch clock 
produces an average revenue of 475 cents, 
whereas the English clock produces an aver- 
age revenue of 445 cents. 
ERRATA


Lifetime Earnings and the Vietnam Era Draft Lottery:
Evidence from Social Security Administrative Records 


By JOSHUA D. ANGRIST*


I. Errors in the Figures 


Due to a typographer’s error, the figures 
in my article published in The American 
Economic Review (Volume 80, No. 3, June 
1990, pp. 313-36) were transposed and in- 
correctly labelled. The figure on page 315, 
labelled Figure 1, is actually Figure 3; the 
figure on page 324, labelled Figure 3, is 
actually Figure 1. The title and caption for 
Figure 1 appear under Figure 3, and vice 
versa. A corrected set of figures is repro- 
duced in this note, and the figures are briefly 
described here as well as in the original 
paper. 

Men were called for conscription in the 
Vietnam era draft lottery according to lot-
tery numbers ranging from 1 to 365 which
were randomly assigned to dates of birth.
Men with lottery numbers below the highest
number called for induction are referred to 
in the paper as “draft-eligible.” Figure 1 
shows the history of FICA (Social Security) 
taxable earnings for draft-lottery partici- 
pants born between 1950 and 1953. For 
each cohort and race, two lines are drawn: 
one for draft-eligible men and one for men 
with lottery numbers that exempted them 
from the draft. 

Figure 2 presents a magnified view of the 
effect of draft eligibility on earnings. This 
figure plots the time series of differences in 
earnings by draft-eligibility status. The fig- 
ure shows no difference in earnings by 
draft-eligibility status before the year of 
conscription risk (1970-3 for men born 
1950-3), while in subsequent years the earn-
ings of draft-eligible men generally fall be-
low the earnings of men who could not be
drafted.


*Department of Economics, Littauer Center, Har-
vard University, Cambridge, MA 02138.

Figure 3 graphs mean W-2 (federal in-
come taxable) earnings in 1981-4 by cohort
and lottery number ($\bar{y}_{ctj}$) against probabili-
ties of veteran status by cohort and lottery
number ($\hat{p}_{cj}$). Earnings are in 1978 dollars.
Plotted in the figure are the average (over
four years of earnings) residuals from a
regression of earnings and probabilities on
period ($\delta_t$) and cohort ($\beta_c$) effects. Thus,
the slope of the regression line drawn
through the points corresponds to an esti-
mate of $\alpha$ in


$$\bar{y}_{ctj} = \beta_c + \delta_t + \hat{p}_{cj}\,\alpha + \bar{u}_{ctj},$$


which is equation (3) in the paper. Esti-
mates of equation (3) are equivalent to in-
strumental variables estimates of


$$y_{cti} = \beta_c + \delta_t + s_i\,\alpha + u_{it},$$


where $s_i$ indicates veteran status and the
instruments are dummy variables that indi-
cate lottery number, cohort, and year of
earnings. The parameter $\alpha$ represents the
effect of veteran status on earnings and is
estimated as −2,384 dollars with a standard
error of 778 dollars.


II. Error in Footnote 7 


The formula given in Footnote 7 for the 
sampling variance of the Wald estimates 
reported in Table 3 is incorrect. The correct 
variance formula is 


$$(\hat{p}^{e} - \hat{p}^{n})^{-2}\,[\Phi + \alpha^{2}\phi]$$


where $\Phi$ is the variance of $\bar{y}^{e} - \bar{y}^{n}$ and $\phi$ is
the variance of $\hat{p}^{e} - \hat{p}^{n}$. The formula used
FIGURE 1. SOCIAL SECURITY EARNINGS PROFILES BY DRAFT-ELIGIBILITY STATUS


Notes: The figure plots the history of FICA taxable earnings for the four cohorts born
1950-3. For each cohort, separate lines are drawn for draft-eligible and draft-ineligi-
ble men. Plotted points show average real (1978) earnings of working men born in 
1953, real earnings + $3000 for men born in 1950, real earnings + $2000 for men born 
in 1951, and real earnings +$1000 for men born in 1952. 





[Figure 2 plot. Panels: WHITES EARNINGS DIFFERENCE, NONWHITES EARNINGS DIFFERENCE. Horizontal axis: YEAR, 66 to 84. Series: BORN 1950, BORN 1951, BORN 1952, BORN 1953]


FIGURE 2. THE DIFFERENCE IN EARNINGS BY DRAFT-ELIGIBILITY STATUS


Notes: The figure plots the difference in FICA taxable earnings by draft-eligibility 
status for the four cohorts born 1950-3. Each tick on the vertical axis represents $500 
real (1978) dollars. 
FIGURE 3. EARNINGS AND THE PROBABILITY OF VETERAN STATUS BY
LOTTERY NUMBER


Notes: The figure plots mean W-2 compensation in 1981-4 against probabilities of 
veteran status by cohort and groups of five consecutive lottery numbers for white men 
born 1950-3. Plotted points consist of the average residuals (over four years of
earnings) from regressions on period and cohort effects. The slope of the least-squares
regression line drawn through the points is −2,384, with a standard error of 778, and
is an estimate of $\alpha$ in the equation


$$\bar{y}_{ctj} = \beta_c + \delta_t + \hat{p}_{cj}\,\alpha + \bar{u}_{ctj}$$


in the paper assumes $\alpha = 0$. Replacing $\alpha$
with a consistent estimate in the correct
formula gives the variance for the two-sam-
ple instrumental variables estimate that is
equivalent to the Wald estimate. As an em-
pirical matter, use of the correct formula
raises the standard errors in Table 3 by
roughly 10 percent.
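The effect of the correction can be illustrated numerically. The sketch below assumes the variance takes the delta-method form for a ratio of two uncorrelated differences, (variance of the earnings difference plus $\alpha^2$ times the variance of the probability difference) divided by the squared probability difference; the function name and all input numbers are hypothetical and are not taken from the paper.

```python
def wald_variance(dp, alpha, var_dy, var_dp):
    """Approximate sampling variance of a Wald estimate.

    dp:     difference in veteran-status probabilities between groups.
    alpha:  the effect being estimated (set to 0 to mimic the
            uncorrected formula).
    var_dy: variance of the difference in mean earnings.
    var_dp: variance of the difference in probabilities.
    """
    return (var_dy + alpha ** 2 * var_dp) / dp ** 2


# Hypothetical inputs chosen so the arithmetic is easy to follow.
uncorrected = wald_variance(dp=0.5, alpha=0.0, var_dy=1.0, var_dp=0.25)
corrected = wald_variance(dp=0.5, alpha=2.0, var_dy=1.0, var_dp=0.25)
print(uncorrected, corrected)  # 4.0 8.0
```

Setting the effect to zero drops the sampling error in the estimated probabilities, so the uncorrected formula can only understate the variance; the size of the understatement depends on the inputs.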
ERRATA


On the Optimal Tax Base for Commodity Taxation 


By JOHN DOUGLAS WILSON*


The paper, “On the Optimal Tax Base
for Commodity Taxation,” by John Douglas 
Wilson (American Economic Review, De- 
cember 1989, Volume 79, No. 5, pp. 
1196-1206) contains several typographical 
errors. Equation (12) on page 1200 should 
be 


(12) y: WW /dy 


+d[t(db dy) y — dC /dy] =0. 
Equation (19) on page 1201, although cor-
rect if properly interpreted, should be writ-
ten in the following form:


dMB(o,¥,t*(e,y,R)) 


(19) ay 


*Department of Economics, Ballantine Hall, Indi-
ana University, Bloomington, IN 47405. 
: dIMB(a,y,t*(o,y,R))
ot 
dt*(o,y,R 
` (€, y, R) 
oy 
- ——— <0. 
dy 


The reference to Subsection II.B in the last 
paragraph on page 1201 should be to Sub- 
section I.B. The last line of equation (A2) 
on page 1205 should be 


x b(1— b)log(1—- 14). 


Finally, the first line of equation (A7) on 
page 1205 should be 


(A7) L,,=—Al1+(1—-b)(e@-1) 
(i.e., L,,, not L,). 


ERRATUM 


The Future of the Income Tax 


By JOSEPH A. PECHMAN*


A serious error appeared in Joseph A. 
Pechman’s 1989 Presidential Address, which 
was published in the March 1990 issue of 
The American Economic Review. The erro- 
neous passage occurs on page 7 at the end 
of the second full paragraph. 


*Joseph Pechman passed away on August 19, 1989. 
This correction was communicated by Henry Aaron, 
The Brookings Institution, 1775 Massachusetts Av- 
enue, N.W., Washington, DC 20036. 
The original text reads as follows: “Burt-
less estimates that the Reagan tax and
transfer policies increased average annual
taxes of men aged 25-54 by no more than
2-4 percent and of women in the same age
group by no more than 3.5 percent.” The
text should read: “Burtless estimates that
the Reagan tax and transfer policies in-
creased average annual labor supply of men
aged 25-54 by no more than 2-4 percent
and of women in the same age group by no
more than 3.5 percent.” (Emphasis added.)